Anthropic CEO Dario Amodei on Being an Underdog, AI Safety, and Economic Inequality

Hanging on the wall of Anthropic’s offices in San Francisco in early May, a stone’s throw from the conference room where CEO Dario Amodei would shortly sit for an interview with TIME, was a framed meme. Its single panel showed a giant robot ransacking a burning city. Underneath, the image’s tongue-in-cheek title: Deep learning is hitting a wall. That’s a refrain you often hear from AI skeptics, who claim that rapid progress in artificial intelligence will soon taper off. But in the image, an arrow points to the robot, labeling it “deep learning.” Another points to the devastated city: “wall.”

The cheeky meme sums up the attitude inside Anthropic, an AI company where most employees seem to believe that AI progress isn’t slowing down—and could potentially result in dangerous outcomes for humanity. Amodei, who was on the cover of TIME last month as his company was named one of TIME’s 100 Most Influential Companies of 2024, says Anthropic is devoted to studying cutting-edge AI systems and developing new safety methods. And yet Anthropic is also a leading competitor to OpenAI, releasing powerful tools for use by the public and businesses, in what even Amodei himself has worried might be a perilous race that could end badly.

On June 20, Anthropic fired its latest broadside in that race, releasing the newest version of its Claude chatbot: Claude 3.5 Sonnet. The model, according to the company, sets new industry standards on reasoning, coding and some types of math, beating GPT-4o, the most recent offering from OpenAI. “With today’s launch, we’re taking a step towards what we believe could be a significant shift in how we interact with technology,” Amodei said in a statement that day. “Our goal with Claude isn’t to create an incrementally better [large language model] but to develop an AI system that can work alongside people and software in meaningful ways.”

TIME’s cover story about Amodei last month delved into the question of whether Anthropic can succeed at its safety mission while under pressure to compete with OpenAI, as well as tech giants like Microsoft, Google and Amazon, the latter two of which are significant investors in Anthropic. Today, TIME is publishing a longer excerpt from the interview conducted in early May for that piece. It has been condensed and edited for clarity.

Anthropic is the youngest “frontier” AI lab. It’s the smallest. Its competitors have more cash. And in many ways, it’s the most expressly committed to safety. Do you think of yourselves as underdogs?

Certainly all the facts you cite are true. But it’s becoming less true. We have these big compute deals. We have high single-digit billions in funding. Whereas our competitors have low double-digits to, somewhat frighteningly, triple-digits. So I’m starting to feel a little bit awkward about calling ourselves the underdogs. But relative to the heavyweights in the field, I think it’s absolutely true.

The AI “scaling laws” say that as you train systems with more computing power and data, they become predictably more capable, or powerful. What capabilities have you seen in systems that you’re not able to release yet?

We released our last model relatively recently, so it’s not like there’s a year of unreleased secrets. And to the extent that there are, I’m not going to go into them. But we do know enough to say that the progress is continuing. We don’t see any evidence that things are leveling off. The reality of the world we live in is that it could stop any time. Every time we train a new model, I look at it and I’m always wondering—I’m never sure in relief or concern—[if] at some point we’ll see, oh man, the model doesn’t get any better. I think if [the effects of scaling] did stop, in some ways that would be good for the world. It would restrain everyone at the same time. But it’s not something we get to choose—and of course the models bring many exciting benefits. Mostly, it’s a fact of nature. We don’t get to choose, we just get to find out which world we live in, and then deal with it as best we can.

What did you set out to achieve with Anthropic’s culture?

We have a donation matching program. [Anthropic allows employees to donate up to 25% of their equity to any charity, and will match the donation.] That’s inclined to attract employees for whom public benefit is appealing. Our stock grants look like a better deal if that’s something you value. It doesn’t prevent people from having financial incentives, but it helps to set the tone.

In terms of the safety side of things, there’s a little bit of a delta between the public perception and what we have in mind. I think of us less as an AI safety company, and more think of us as a company that’s focused on public benefit. We’re not a company that believes a certain set of things about the dangers that AI systems are going to have. That’s an empirical question. I more want Anthropic to be a company where everyone is thinking about the public purpose, rather than a one-issue company that’s focused on AI safety or the misalignment of AI systems. Internally I think we’ve succeeded at that, where we have people with a bunch of different perspectives, but what they share is a real commitment to the public purpose.

There’s a real chance that Donald Trump gets elected at the end of this year. What does that mean for AI safety?

Look, whoever the next president is, we’re going to work with them to do the best we can to explain both that the U.S. needs to stay ahead of its adversaries in this technology, but also that we need to provide reasonable safeguards on the technology itself. My message is going to be the same. Obviously different administrations are going to have different views, and I expect different [government] policies depending on the outcome of the election. All I can really do is say what I think is true about the world.

One of our best methods for making AI systems safer is “reinforcement learning,” which steers models in a certain direction—to be helpful and harmless—but doesn’t remove their dangerous capabilities, which can still be accessible through techniques like jailbreaking. Are you worried by how flimsy that method seems?

I don’t think it’s inherently flimsy—I would more say that we’re early in the science of how to steer these systems. Early in that they hallucinate, early in that they can be jailbroken. Early in the sense that, every time we want to make our model more friendly sounding, then it’ll turn out that in some other context, it gives responses that are too long. You’re trying to make it do all the good things, and not all the bad things, but that’s just hard to do. I think it’s not perfect, and I think it’ll probably never be perfect. We have a Swiss cheese model: you can have one layer that has holes in it, but if you have 10 layers of Swiss cheese, it’s hard for anything to fly through all 10 layers. There’s no silver bullet that’s going to help us steer the models. We’re going to have to put a lot of different things together.

Is it responsible to use a strategy like that if the risks are so high?

I’d prefer to live in an ideal world. Unfortunately in the world we actually live in, there’s a lot of economic pressure, there’s not only competition between companies, there’s competition between nations. But if we can really demonstrate that the risks are real, which I hope to be able to do, then there may be moments when we can really get the world to stop and consider for at least a little bit. And do this right in a cooperative way. But I’m not naive—those moments are rare, and halting a very powerful economic train is something that can only be done for a short period of time in extraordinary circumstances.

Meta is open-sourcing models that aren’t too far behind yours, in terms of capabilities. What do you make of that strategy? Is it responsible?

Our concern is not primarily about open-source models, it’s about powerful models. In practice, what I think will happen with most of these companies—I suspect it will happen with Meta, but I’m not certain—is that they’ll stop open-sourcing below the level where I have concerns. I ultimately suspect that open-source as an issue is almost a red herring. People treat open-source models as if they undermine our business model. The truth is, our competitors are anyone who can produce a powerful model. When someone hosts one of these models—and I think we should call them open weights models—they get put on the cloud and the economics are the same as everything else.

Microsoft recently said it is training a large language model in-house that might be able to compete with OpenAI, which they’ve invested in. Amazon, which is one of your backers, is doing the same. So is Google. Are you concerned that the smaller AI labs, which are currently out in front, might just be in a temporary lead against these much better-resourced tech companies?

I think a thing that has been prominent in our culture has been “do more with less.” We always try to maintain a situation where, with less computational resources, we can do the same or better than someone who has many more computational resources. So in the end, our competitive advantage is our creativity as researchers, as engineers. I think increasingly in terms of creative product offerings, rather than pure compute. I think you need a lot of pure compute. We have a lot, and we’re gonna get more. But as long as we can use it more efficiently, as long as we can do more with less, then in the end, the resources are going to find their way to the innovative companies.

Anthropic has raised around $7 billion, which is enough money to pay for the training of a next-generation model, which you’ve said will likely cost in the single-digit billions of dollars. To train the generation after that, you’re starting to have to look at raising more cash. Where do you think that will come from?

I think we’re pretty good for the one after the next one as well, although in the history of Anthropic, we’re going to need to raise more. It’s hard to say where that [will come from]. There are traditional investors. There are large compute providers. And there are miscellaneous other sources that we may not have thought of.

One source of funding that doesn’t get talked about very much is the government. They have this scale of money—it might be politically difficult, but is that something you’ve thought about, or even had conversations about?

The technology is getting powerful enough that government should have an important role in its creation and deployment. This is not just about regulation. This is about national security. At some point, should at least some aspects of this be a national project? I certainly think there are going to be lots of uses for models within government. But I do think that as time goes on, it’s possible that the most powerful models will be thought of as national-security-critical. I think governments can become much more involved, and I’m interested in having democratic governments lead this technology in a responsible and accountable way. And that could mean that governments are heavily involved in the building of this technology. I don’t know what that would look like. Nothing like that exists today. But given how powerful the thing we’re building is, should individuals or even companies really be at the center of it once it gets to a certain strength? I could imagine in a small number of years—two, three, five years—that could be a very serious conversation, depending on what the situation is.

Do you think the current level of inequality in the world is too much?

I would say so. We have billions of people who still live on less than $1 per day. That just doesn’t seem good.

One of the things you hear from someone like Sam Altman is, if we can raise the floor for those people, then he’s comfortable with the existence of trillionaires. How do you think about the power imbalance that comes with that level of inequality?

There’s two separate things there. There’s lifting the floor, and then there’s power within the world. I think I’m a little less sanguine about this concentration of power. Even if we materially provide for everyone, it’s hard for democracies and democratic decision-making to exist and to be fair, if power is concentrated too much. Financial concentration of power is one form of power, and there are others. But there were periods in U.S. history—look at the Gilded Age—where industrialists essentially took over government. I think that’s probably too extreme. I’m not sure the economy can function without some amount of inequality. But I’m concerned we might be in one of those eras, where it’s trending towards being too extreme to be healthy for a democratic polity.

Ideas around guaranteed basic income—if we can’t think of anything better, I certainly think that’s better than nothing. But I would much prefer a world in which everyone can contribute. It would be kind of dystopian if there are these few people that can make trillions of dollars, and then the government hands it all out to the unwashed masses. It’s better than not handing it out, but I think it’s not really the world we want to aim for. I do believe that if the exponential [rate of AI progress] is right, AI systems will be better than most humans, maybe all humans, at doing most of the things humans do. And so we’re really going to need to rethink a lot. In the short run, I always talk about complementarity—we should make sure humans and AIs work together. That’s a great answer in the short run. I think in the long run, we’re really going to need to think about, how do we organize the economy, and how humans think about their lives? One person can’t do that. One company can’t do that. That’s a conversation among humanity. And my only worry is, if the technology goes fast, we’ll have to figure it out fast.