If you’re a software engineer wondering how you might move into AI safety, this is what the first concrete steps look like. Or, rather, what my first concrete steps look like. Yours may be different, but I hope this inspires you to get started on your journey. After 18 years focused almost entirely on iOS apps, I realized that I want to have a meaningful impact on the world, and AI safety is a place where I can do that.
On Saturday, I completed BlueDot Impact’s Technical AI Safety course – 6 intense days that took up almost all my time, but it was totally worth it. We had excellent discussions about mechanistic interpretability, constitutional AI, evaluations, and practical ways to contribute to AI safety work. I enjoyed connecting with the people in my cohort and getting to know them.
And the capstone? Building a concrete 30-day action plan to answer the question “how can I make AI safer?”
I’ve started making open-source contributions to Inspect AI, an evaluation framework developed by the UK AI Security Institute and used by organizations like Anthropic, Google DeepMind, xAI, SecureBio, and Apollo Research. I’m optimistic about making more meaningful contributions over the next 30 days.
When I started looking at Inspect, I wasn’t sure what to do. I had seen a request for help in the BlueDot Impact Slack group with links to a few issues, but those had all been closed by the time I got there (which is great!). So I hunted around and thought a bit about what I’d need to get started.
It can be disorienting for me when I start working on a new project. I have all sorts of questions like these (if you’re nodding along, you’re not alone):
- What’s a good first issue to start working on? Inspect does have a “good first issue” label on GitHub, but there aren’t any open issues with that label.
- After I choose an issue I’d like to fix, what happens next? I don’t want to work on something for a few days, only to see someone else create a pull request (PR) before I do, rendering my work meaningless.
- Before I create a PR, what should I check or do? Do I need to install the pre-commit hooks? Lint, format, run tests?
- Are there code formatting guidelines? Does the formatter handle all of them, or are there some I’ll need to look out for and do manually?
- Do I need to do anything besides formatting and getting unit tests to pass? Does new code always need to be covered by unit tests?
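For the workflow questions, the pre-PR ritual in most Python repositories boils down to a handful of commands. This is a hedged sketch, not Inspect’s actual policy – the specific tools (pre-commit, ruff, mypy, pytest) are assumptions based on common Python practice, so defer to the project’s own documentation:

```shell
# Hypothetical pre-PR checklist for a Python repo -- the tool choices here
# (pre-commit, ruff, mypy, pytest) are assumptions; check the repo's docs.
pip install -e ".[dev]"   # editable install with development dependencies
pre-commit install        # run the repo's hooks automatically on each commit
ruff check .              # lint
ruff format .             # format
mypy .                    # type-check
pytest                    # run the unit tests
```

Answering exactly which of these steps a given project expects (and in what order) is precisely the kind of thing a contributing guide should spell out.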
So I did what any engineer would do – I documented the answers. (You document answers, too, right?)
For Inspect, I created a GitHub issue explaining all of this and suggesting a contributing guide to address it. The following day, I created the guide, answering my own questions as best I could from what I’d gleaned from the README, pull requests, and other parts of the repository. Then I opened a pull request that I hope will be a fine addition to Inspect.
Over the next 30 days, I’ll be sharing weekly updates on my Inspect contributions – what I’m working on, challenges I hit, and practical lessons for anyone trying to break into AI safety work. No academic jargon, no assumed PhDs, just real engineering challenges and solutions.
