AI alignment researcher.
Formerly at DeepMind and OpenAI. Now at ARC. Currently trying to figure out how we'll know when we're close to dangerous AI, and how to detect misalignment and deceptive alignment.
I sometimes post alignment-related thinking here. Highlights:
Contact me at beth dot m dot surname at gmail.com
If you have any feedback for me, I'd love to hear it. You can submit it anonymously (or pseudonymously) here.