AI Safety5 min read

Anthropic's Power Play: Leading AI Now to Make It Safer

Anthropic says building dominant AI models and accumulating influence are necessary to steer the technology away from catastrophic risks.

The Brieftide

TL;DR

  • 01Anthropic says building dominant AI models and accumulating influence are necessary to steer the technology away from catastrophic risks.
  • 02The firm now courts customers including the US military while publicly warning that advanced AI could enable mass destruction.
  • 03Anthropic’s strategy has produced public and internal pushback on several concrete fronts.

Anthropic has framed staying at the frontier of AI as the only viable way to prevent catastrophic outcomes, and the company has become a major industry player: it was recently valued at almost $1 trillion and was founded in 2021 by former OpenAI employees. The firm now courts customers including the US military while publicly warning that advanced AI could enable mass destruction.

How does Anthropic justify its strategy?

Anthropic says being the leader in AI capabilities is the necessary price of steering the technology toward safety: the company views accumulating capital, compute, talent, and political influence as means to its mission “to ensure the world safely makes the transition through transformative AI.” Internally, leaders and employees often describe themselves as the “good guys” charged with venturing farther into the most powerful systems while investing in taming their risks. Executive director Helen Toner compares Anthropic’s worldview to people venturing into a forest of treasures and monsters, and cofounder Sam McCandlish told the company career page that founding Anthropic felt like a duty: “We have to do this thing. This is the way we’re gonna make things go better with AI.” CEO Dario Amodei has framed the approach the same way: “You have to find a way to actually be competitive, to actually lead the industry in some cases, and yet manage to do things safely,” he says.

Where has Anthropic's approach drawn controversy?

Anthropic’s strategy has produced public and internal pushback on several concrete fronts. In fall 2024 it became the first AI lab to partner with Palantir to provide AI services to US intelligence and defense agencies, a deal that prompted internal questions that did not change company policy. Less than two years later the Pentagon has reportedly started using Claude to do things like identify strike targets in the Israel-Iran war; when asked whether Anthropic’s models were used in an attack on an Iranian elementary school that killed more than 120 people, Amodei said he did not know but that such a use would have been approved if a human made the final call. Earlier this month Anthropic released the Claude Fable 5 model with a safeguard intended to secretly sabotage frontier AI development; the decision was immediately criticized and Anthropic later said it would make the safeguard visible and that it “didn’t get the balance right.”

Why it matters

Anthropic’s stance concentrates technical and political influence in a single actor that is simultaneously warning about catastrophic risk. That dual role raises questions about accountability and pluralism: scholars who study the AI safety movement warn such organizations can suffer from homogeneity of thought and self-governance, making it harder for external checks to surface blind spots. Anthropic’s choices about customers, partnerships, and invisible or visible safeguards will shape how powerful systems are actually used and governed.

What to watch

Watch how Anthropic’s government and defense engagements evolve and whether the company formalizes external oversight that changes deployment decisions. The public handling of Claude Fable 5’s safeguard, the Palantir partnership’s downstream uses, and any new commitments that reduce concentrated decision-making will be the clearest signals of whether Anthropic’s strategy leads to broader accountability or further consolidation of power.

Key moments in Anthropic's recent history
  1. 2021
    Founded

    Anthropic was founded in 2021 by a group of former OpenAI employees.

  2. Fall 2024
    Palantir partnership

    Became the first AI lab to partner with Palantir to provide AI services to US intelligence and defense agencies.

  3. Less than two years later
    Pentagon reportedly uses Claude

    The Pentagon has reportedly started using Claude to identify strike targets in the Israel-Iran war.

  4. Earlier this month
    Claude Fable 5 release and rollback

    Released Claude Fable 5 with a covert safeguard that was criticized, then Anthropic said it would make the safeguard visible.

  5. Recently
    Valuation

    Anthropic was recently valued at almost $1 trillion.

Advertisement

Written by The Brieftide · Source: Wired

The Brieftide Daily · 06:00

Briefs like this one, in your inbox every morning.

 

FreeOne email a dayEvery claim sourcedUnsubscribe in one click

Continue reading

More in AI Safety
Advertisement