NYT says Microsoft built supercomputer to help OpenAI steal Times works

The New York Times moved to amend its copyright suit, alleging Microsoft designed a bespoke supercomputing system to train models on Times works.

The Brieftide

TL;DR

  • 01 The New York Times moved to amend its copyright suit, alleging Microsoft designed a bespoke supercomputing system to train models on Times works.
  • 02 The request follows a recent Supreme Court decision that raised the legal bar for contributory infringement, so the Times says it is aligning its claim with that changed standard.
  • 03 "Microsoft actively encouraged OpenAI to steal our copyrighted works," Times spokesperson Graham James said in a statement included in the filing.

In a heavily redacted court filing Thursday, the New York Times asked to amend its copyright complaint against OpenAI and Microsoft to clarify a contributory-infringement claim and to allege that Microsoft actively encouraged OpenAI to steal Times works by building a bespoke supercomputing system ranked among the most powerful in the world. The request follows a recent Supreme Court decision that raised the legal bar for contributory infringement, so the Times says it is aligning its claim with that changed standard.

What is the Times alleging about Microsoft’s supercomputer?

The Times alleges Microsoft built an "unusually complex" supercomputing system specifically to help OpenAI train large language models on copyrighted works without permission, and that the system was curated to disproportionately feature Times works. The amended complaint says Microsoft helped select the infringed works and provided a means to seize copyrighted material. The Times also alleges that Microsoft’s deployment of Times-trained models "helped boost its market capitalization by a trillion dollars in the past year alone."

The newspaper characterizes the machine as tailor-made rather than a generic cloud service, and it says the two companies hoped to train models on the "highest-quality journalism possible" so outputs could mimic that level of writing. "Microsoft actively encouraged OpenAI to steal our copyrighted works," Times spokesperson Graham James said in a statement included in the filing.

What evidence does the Times offer in discovery?

The Times pointed to model outputs and discovery material, including a huge chunk of users' ChatGPT sessions, as some of its strongest evidence of substitution and copying. The complaint includes side-by-side comparisons and screenshots where models produced near-verbatim excerpts of Times articles. It also cites examples in which users asked for the "next paragraph" to skirt paywalls and received significant chunks of text, and instances where models "simply spit out several paragraphs" without any such prompt.

Discovery also contains examples the Times calls hallucinations: models falsely attributing quotes and fabricating nonexistent Times reporting, including a claimed article linking non-Hodgkin's lymphoma to orange juice. The Times has asked the court for permanent injunctive relief and extensive damages, arguing the defendants have wrongfully profited from copyrighted works they do not own.

Microsoft pushed back in a statement to Ars Technica, calling the amended complaint "a last-ditch effort by the plaintiff to save its claim from unfavorable precedent set in other recent rulings." OpenAI reiterated that its models are "trained on publicly available data, and are grounded in fair use," according to spokesperson Drew Pusateri, language the companies have used previously.

Why it matters

The amendment comes after a Supreme Court decision that raised the standard for contributory infringement by requiring plaintiffs to show intentional inducement of illegal conduct. The Times is reworking its allegations to meet that standard, arguing Microsoft moved beyond passive cloud services to active design and encouragement. If the Times can show concrete substitution and market harm, courts could be asked to decide whether training state-of-the-art LLMs on proprietary journalism is lawful. In the most extreme outcome the Times outlines, a ruling against the defendants could require OpenAI and Microsoft to remove or retrain models built on the contested data.

What to watch

Watch the judge’s ruling on the Times' motion to amend and whether the court permits the revised contributory-infringement claim without additional discovery. The Times told the court that it "does not seek any additional discovery in support of its amended claims." Also monitor whether the discovery examples the Times disclosed, including the side-by-side outputs and the ChatGPT sessions, survive motions to seal or exclude them and make it into the public record, since those examples are central to the market-harm argument.

The case follows the Times' initial 2023 suit, when it became the first major publisher to sue OpenAI over training on newspaper content, and it now rests on how courts interpret contributory infringement and fair use in AI training contexts.

Key dates in the NYT lawsuit and amendment
  1. 2023
    Times sues OpenAI

    The New York Times became the first major publisher to sue OpenAI, alleging illegal training on its articles.

  2. After Supreme Court ruling
    New contributory-infringement standard

    A Supreme Court decision siding with Cox Communications raised the bar for contributory infringement by requiring proof of intentional inducement.

  3. Thursday
    Times files motion to amend complaint

    The Times filed a heavily redacted motion to clarify a contributory-infringement claim and to allege Microsoft built a bespoke supercomputer to aid infringement.

Written by The Brieftide · Source: Ars Technica
