Elizabeth (Beth) Barnes

AI evaluations researcher.

Formerly at DeepMind and OpenAI. Founder & Head of Research at METR. Developing evaluations of dangerous capabilities so we know if we're getting close to very risky AI.

Some research highlights:

Resources for Autonomy Evaluations - task suite, evaluation protocol, estimates of the "elicitation gap"
Evaluating LLM Agents on Realistic Autonomous Tasks
Evaluating LLMs trained on code (alignment section)
Obfuscated arguments problem - a problem with recursive-decomposition-based alignment approaches
"Imitative generalisation" - explainer for Paul Christiano's "Learning the Prior"
Risks from AI persuasion - thoughts on the likelihood and consequences of superhuman persuasion before AGI
Reflection mechanisms as an alignment target - work done by my AI Safety Camp mentees, who surveyed Mechanical Turkers on their feelings towards different reflection mechanisms

I sometimes post alignment-related thinking here.

Contact me at: beth dot m dot surname at gmail.com

If you have any feedback for me, I'd love to hear it. You can submit it anonymously (or pseudonymously) here.