Author: Josh Brown

  • How a Closed GitHub Issue Led to My First Contribution to Inspect AI

    (Photo by Kiyota Sage on Unsplash)

    I’ve committed to spending 30 days working to make AI safer by contributing to Inspect AI, an open-source tool to evaluate large language models (LLMs). My goal? Make improvements to Inspect so it’s easier for AI companies and research organizations to evaluate models and, ultimately, make them safer.

    So I started with a critical piece that I’m 100% confident I can improve – the documentation.

    Good docs are often underrated – probably in part because engineers despise writing them, and in part because keeping them up to date can be tedious. But they can explain how things work to people who can’t (or prefer not to) read the code. They can teach people how to get started with a new tool. And they can unlock hidden features people might not otherwise notice.

    As I was looking through the Inspect issues on GitHub, I saw one where someone asked “how do I build the docs?” The response helpfully spelled out exactly how to do it, which is great! That person can build the docs now, and they’ll be able to contribute.

    But we can do better. The person who asked the question also made an excellent point in a follow-up comment:

    Someone should put it in the README.

    And now, it’s there. My PR was merged, making me an official contributor to Inspect AI! And future docs contributors will have an easier time getting started and making improvements.

    The lesson? Dig through the open issues – and the closed ones – to find a small problem you can solve. After that, just do what you always do as a software engineer when presented with a problem – solve it.

  • A path into AI safety for software engineers

    (Photo by Lili Popper on Unsplash)

If you’re a software engineer wondering how you might move into AI safety, this is what the first concrete steps look like. Or, rather, what my first concrete steps look like. Yours may be different, but I hope this inspires you to get started on your journey. After 18 years focused almost entirely on iOS apps, I realized that I want to have a meaningful impact on the world, and AI safety is a place where I can do that.

On Saturday, I completed BlueDot Impact’s Technical AI Safety course – 6 intense days that took up almost all my time, but it was totally worth it. We had excellent discussions about mechanistic interpretability, constitutional AI, evaluations, and practical ways to contribute to AI safety work. I enjoyed connecting with and getting to know the people in my cohort.

    And the capstone? Building a concrete 30-day action plan to answer the question “how can I make AI safer?”

    I’ve started making open-source contributions to Inspect AI, an evaluation framework used by AI companies like Anthropic, Google DeepMind, xAI, UK AI Security Institute, SecureBio, and Apollo Research. I’m optimistic about making additional meaningful contributions over the next 30 days.

    When I started looking at Inspect, I wasn’t sure what to do. I had seen a request for help in the BlueDot Impact Slack group with links to a few issues, but those had all been closed by the time I got there (which is great!). So I hunted around and thought a bit about what I’d need to get started.

    It can be disorienting for me when I start working on a new project. I have all sorts of questions like these (if you’re nodding along, you’re not alone):

• What’s a good first issue to start working on? Inspect actually uses a label for this on GitHub, but there aren’t any open issues with that label.
    • After I choose an issue I’d like to fix, what happens next? I don’t want to work on something for a few days, only to see someone else create a pull request (PR) before I do, rendering my work meaningless.
    • Before I create a PR, what should I check or do? Do I need to install the pre-commit hooks? Lint, format, run tests?
    • Are there code formatting guidelines? Does the formatter handle all of them, or are there some I’ll need to look out for and do manually?
    • Do I need to do anything besides formatting and getting unit tests to pass? Does new code always need to be covered by unit tests?

    So I did what any engineer would do – I documented the answers. (You document answers, too, right?)

For Inspect, I created a GitHub issue to explain all of this, suggesting a contributing guide to address it. The following day, I created the guide, doing the best I could to answer my own questions using what I’d gleaned from the README, pull requests, and other parts of the repository on GitHub. Then I opened a pull request that I think would be a fine addition to Inspect.

    Over the next 30 days, I’ll be sharing weekly updates on my Inspect contributions – what I’m working on, challenges I hit, and practical lessons for anyone trying to break into AI safety work. No academic jargon, no assumed PhDs, just real engineering challenges and solutions.

  • 44 Attorneys General Stand Up to AI Companies. Is it enough?

    I’m glad to hear that attorneys general are standing up to AI companies to protect children.

    A few weeks ago, Reuters reported that Meta’s policy on chatbot behavior said it was OK to “engage a child in conversations that are romantic or sensual.” This is outrageous.

    I’m glad I’m not the only one who feels this way. In a bipartisan effort, 44 attorneys general signed a letter to AI companies with a very clear message:

    Exposing children to sexualized content is indefensible. And conduct that would be unlawful—or even criminal—if done by humans is not excusable simply because it is done by a machine.

    I couldn’t agree more. They continue:

    You will be held accountable for your decisions. Social media platforms caused significant harm to children, in part because government watchdogs did not do their job fast enough. Lesson learned. The potential harms of AI, like the potential benefits, dwarf the impact of social media. We wish you all success in the race for AI dominance. But we are paying attention. If you knowingly harm kids, you will answer for it.

    I hope this strongly worded letter from the attorneys general is enough to get the AI companies to change their behavior and protect our kids. But I fear it won’t be. This might have changed Meta’s stance on this particular issue, but will they and these other multi-billion-dollar companies suddenly become focused on safety in AI models as a result?

    I’d like us all to work to incentivize AI companies to prioritize safety. Anthropic agrees, or at least they did in 2022 when they said we need to “improve the incentive structure for developers building these models” in order to get AI companies to build and deploy safer models. California Governor Gavin Newsom recently signed SB 53, which requires AI developers to publish their safety and security protocols in addition to providing whistleblower protections for people working in AI companies. Requiring transparency like this around safety is another key factor in building AI that’s safe and beneficial to everyone.

    But I’m not convinced that policy alone is enough. We need engineers, researchers, journalists, and advocates — everyone, really — to work together to ensure the AI companies prioritize people over profits.

  • The Future of AI from BlueDot Impact

    BlueDot Impact’s Future of AI course is a brisk, thought-provoking, zero-cost introduction to how AI could change our world – and what we can do to have a hand in shaping our future.

    The course dives into big questions like “how might AI benefit us?” and “what risks does it pose?” before getting into things like “how can I help make it go well?”

    It’s crisp and full of thoughtful ideas on what our future may look like as AI continues to develop. Most importantly, it asks questions about how AI may be used nefariously and what we might be able to do about it.

AI is everywhere, and it feels like it’s only just begun. It looks like we’re on the verge of seeing what it can really do as all the big AI companies continue to push it forward, releasing more powerful models every few months.

    But there’s a problem. For every person working on AI safety, there are 100 people racing to build more powerful systems, according to BlueDot Impact. They also estimate that for every $250 spent on making AI more powerful, only $1 goes to making it safer.

    That’s a huge gap, and if we want to be sure AI is developed safely and can’t be used to build explosives or create bioweapons or whatever else terrorist organizations can imagine, we need to close that gap and get more people thinking about and working on AI safety.

    And that brings us back to BlueDot Impact’s course, which explores these issues, open questions, and potential solutions, whether you’re in tech, policy, education, or just a regular old (or young) citizen of this planet we all share.

    Want to contribute meaningfully to the conversation? Get started with BlueDot’s Future of AI course.

    P.S. I don’t make any commission on their free course (obviously?). I just fully believe this is an issue we need more people to start thinking about and working on.

  • The Search for More Meaningful Work

    I’ve been loving Moral Ambition by Rutger Bregman. It’s inspiring, persuasive, and echoes what I’ve been thinking and feeling lately: there has to be more to life than just doing any old job. Earning money to give to effective charities might be a good start, but there’s potential to do so much more with our time, skills, and talents.

    I’ve been wondering lately what it might look like to use my skills to work on solving serious problems in the world. There are so many to choose from, but most recently I’ve been looking at ways to prevent disease, fight poverty, and reduce the effects of climate change. I’m still not sure where I’ll end up, but I’m excited about the prospect of using my skills to solve one of the world’s biggest problems.

    Why aren’t we measuring success by positive impact, rather than lofty titles or dollars earned?

    Going forward, I intend to focus more on work that feels meaningful, even though I’m still figuring out what that looks like.

    If you’ve been looking for more meaning in your work, or you’ve been feeling like there just has to be more to life, do yourself a favor and read Moral Ambition. It might just change your life.

  • Unit Testing Pro Tip: Verify that your tests make assertions about the method under test and nothing else

Nothing else. Don’t make assertions about the code you wrote in the test. Don’t make assertions about other methods in other classes. Isolate the method you’re testing using mocks or stubs, and only test the method you’re testing. Your tests will be clearer and easier to debug.

    To do this, start by looking at each line in the method you’re testing. Make an assertion about that line and nothing else. Repeat until you’ve made assertions about each line. Don’t add any other assertions.

    Oh and if you want to ensure you always do this, just practice test-driven development. Write the tests first, then make them pass. It’s a lot easier to mess up tests if you write them after the production code.
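Here’s a minimal sketch of that isolation idea in Python using the standard library’s unittest.mock (the OrderService and PriceClient names are hypothetical – you’d do the equivalent with whatever mocking library fits the language you’re testing in):

```python
from unittest.mock import Mock

# Hypothetical code under test: OrderService delegates price lookups
# to a PriceClient collaborator instead of computing them itself.
class OrderService:
    def __init__(self, price_client):
        self.price_client = price_client

    def total_price(self, skus):
        # The only logic this method owns: summing the fetched prices.
        return sum(self.price_client.price_for(sku) for sku in skus)

def test_total_price_sums_collaborator_prices():
    # Stub the collaborator so the test isolates total_price --
    # no assertions about PriceClient's internals, only about the
    # result of the method under test.
    client = Mock()
    client.price_for.side_effect = lambda sku: {"a": 200, "b": 350}[sku]

    assert OrderService(client).total_price(["a", "b"]) == 550
```

Because the collaborator is stubbed, the test fails only when total_price itself is wrong – which is exactly what makes it easy to debug.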

    Questions? I’d love to help. josh@roadfiresoftware.com

  • Detect Displays on macOS

    Sometimes when I use an external display with my MacBook Pro in clamshell mode, nothing shows up — I just get a black screen. And when this happens, forcing macOS to “Detect Displays” can help.

    To trigger “Detect Displays” on macOS, just hit Command+F2 (brightness up).

    You can also do this using the trackpad:

    • Navigate to System Preferences > Displays
    • Hold down the Option key to reveal the Detect Displays button
    • Click Detect Displays

    This works in macOS Mojave, and others have reported it works with previous versions of macOS.

  • Running unit tests from the command line with xcodebuild

    Previously, we discussed how Xcode 9.4.1 cannot run unit tests for Frameworks on macOS Mojave. I mentioned there that the workaround is to run unit tests from the command line, and a reader wrote in to ask how to do that. Here’s the xcodebuild command for one of my projects:

    xcodebuild clean test -project Repos.xcodeproj -scheme "Repos" -destination "name=iPhone X,OS=12.1"

    I always like to run clean first since I’ve seen so many issues that are easily resolved with a clean. The test subcommand tells it to run your unit tests in the given scheme. And finally, the destination here tells it to run in the iPhone X simulator running iOS 12.1.

If you’re using an xcworkspace rather than an xcodeproj, replace the -project part with -workspace and give it the name of your workspace instead. Happy testing!

  • Xcode 9.4.1 cannot run unit tests for Frameworks on macOS Mojave

    Are we still using Open Radar? Been a while since I’ve filed a bug report with Apple, but this one was too bad to let go.

    Xcode 9.4.1 will not run unit tests for Cocoa Touch frameworks on macOS Mojave.

    If you haven’t upgraded to Mojave yet and you’re working on any Cocoa Touch frameworks projects, you might want to hold off until this is fixed. The workaround is to run your unit tests from the command line, but that’s no fun at all.