Anthropic's Power Play: Leading AI Now to Make It Safer
Anthropic says building dominant AI models and accumulating influence are necessary to steer the technology away from catastrophic risks.
TL;DR
- 01Anthropic says building dominant AI models and accumulating influence are necessary to steer the technology away from catastrophic risks.
- 02The firm now courts customers including the US military while publicly warning that advanced AI could enable mass destruction.
- 03Anthropic’s strategy has produced public and internal pushback on several concrete fronts.
Anthropic has framed staying at the frontier of AI as the only viable way to prevent catastrophic outcomes, and the company has become a major industry player: it was recently valued at almost $1 trillion and was founded in 2021 by former OpenAI employees. The firm now courts customers including the US military while publicly warning that advanced AI could enable mass destruction.
How does Anthropic justify its strategy?
Anthropic says being the leader in AI capabilities is the necessary price of steering the technology toward safety: the company views accumulating capital, compute, talent, and political influence as means to its mission “to ensure the world safely makes the transition through transformative AI.” Internally, leaders and employees often describe themselves as the “good guys” charged with venturing farther into the most powerful systems while investing in taming their risks. Executive director Helen Toner compares Anthropic’s worldview to people venturing into a forest of treasures and monsters, and cofounder Sam McCandlish told the company career page that founding Anthropic felt like a duty: “We have to do this thing. This is the way we’re gonna make things go better with AI.” CEO Dario Amodei has framed the approach the same way: “You have to find a way to actually be competitive, to actually lead the industry in some cases, and yet manage to do things safely,” he says.
Where has Anthropic's approach drawn controversy?
Anthropic’s strategy has produced public and internal pushback on several concrete fronts. In fall 2024 it became the first AI lab to partner with Palantir to provide AI services to US intelligence and defense agencies, a deal that prompted internal questions that did not change company policy. Less than two years later the Pentagon has reportedly started using Claude to do things like identify strike targets in the Israel-Iran war; when asked whether Anthropic’s models were used in an attack on an Iranian elementary school that killed more than 120 people, Amodei said he did not know but that such a use would have been approved if a human made the final call. Earlier this month Anthropic released the Claude Fable 5 model with a safeguard intended to secretly sabotage frontier AI development; the decision was immediately criticized and Anthropic later said it would make the safeguard visible and that it “didn’t get the balance right.”
Why it matters
Anthropic’s stance concentrates technical and political influence in a single actor that is simultaneously warning about catastrophic risk. That dual role raises questions about accountability and pluralism: scholars who study the AI safety movement warn such organizations can suffer from homogeneity of thought and self-governance, making it harder for external checks to surface blind spots. Anthropic’s choices about customers, partnerships, and invisible or visible safeguards will shape how powerful systems are actually used and governed.
What to watch
Watch how Anthropic’s government and defense engagements evolve and whether the company formalizes external oversight that changes deployment decisions. The public handling of Claude Fable 5’s safeguard, the Palantir partnership’s downstream uses, and any new commitments that reduce concentrated decision-making will be the clearest signals of whether Anthropic’s strategy leads to broader accountability or further consolidation of power.
- 2021Founded
Anthropic was founded in 2021 by a group of former OpenAI employees.
- Fall 2024Palantir partnership
Became the first AI lab to partner with Palantir to provide AI services to US intelligence and defense agencies.
- Less than two years laterPentagon reportedly uses Claude
The Pentagon has reportedly started using Claude to identify strike targets in the Israel-Iran war.
- Earlier this monthClaude Fable 5 release and rollback
Released Claude Fable 5 with a covert safeguard that was criticized, then Anthropic said it would make the safeguard visible.
- RecentlyValuation
Anthropic was recently valued at almost $1 trillion.
Written by The Brieftide · Source: Wired
The Brieftide Daily · 06:00
Briefs like this one, in your inbox every morning.
Continue reading
More in AI SafetyAgentic Analysis: LLM Pipeline compares ERC-8004 and Google A2A
An LLM-powered pipeline analyzes 4,323 governance participation records across ERC-8004 (permissionless.
Human-centric AI and firm idiosyncratic risks, 2015–2023
Human-centric AI strategies are associated with lower firm idiosyncratic risk among Chinese listed firms.
OpenAI joins Appia Foundation to build shared AI standards
OpenAI supports evaluation frameworks, safety practices and global cooperation through the Appia Foundation.
AI4SE and SE4AI: A decade review of AI in systems engineering
H. Sinan Bank, Daniel R. Herber and Thomas Bradley map three research phases and assess 1.