
Nothing from Something: Can a Language Model Discover 0?

A paper by Phoebe Zeng, Thomas L. Griffiths and Brenden M. Lake tests whether GPT-2–size models can learn the concept of zero, with and without language pretraining.

The Brieftide

TL;DR

  • A paper by Phoebe Zeng, Thomas L. Griffiths and Brenden M. Lake tests whether GPT-2–size models can learn the concept of zero, with and without language pretraining.
  • The paper appears on arXiv as arXiv:2606.17289 and places the question in the context of out-of-distribution mathematical discovery.
  • The paper evaluates whether modern language models can hypothesize a genuinely new mathematical concept, using simple arithmetic as the case study.

Phoebe Zeng, Thomas L. Griffiths and Brenden M. Lake submitted a paper on 15 Jun 2026 titled "Nothing from Something: Can a Language Model Discover 0?" that uses simple arithmetic to ask whether a language model can independently discover the number zero. The paper appears on arXiv as arXiv:2606.17289 and places the question in the context of out-of-distribution mathematical discovery.

What did the researchers test and how?

The paper evaluates whether modern language models can hypothesize a genuinely new mathematical concept, using simple arithmetic as the case study. The authors frame mathematical discovery as a strong form of out-of-distribution generalization and then test whether language models of a GPT-2 size can independently infer the concept of "zero" from training data and generalize at test time.

They position the experiment as probing how much models can extend beyond their training data, and whether language abilities scaffold such generalizations in neural models. The submission lists the subject areas as Artificial Intelligence and Computation and Language and provides a PDF and TeX source on arXiv.
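To make the "infer zero from training data, generalize at test time" setup concrete, here is a minimal sketch of a held-out-zero split. It is an illustration only, not the paper's dataset or code: the operand range, the `a op b = c` text format, and every function name below are assumptions made for this example.

```python
import random
from itertools import product


def make_examples(operands, ops=("+", "-")):
    """Render 'a op b = c' strings, keeping only non-negative results.

    The text format is a hypothetical choice, not taken from the paper.
    """
    examples = []
    for a, b in product(operands, repeat=2):
        for op in ops:
            c = a + b if op == "+" else a - b
            if c >= 0:
                examples.append(f"{a} {op} {b} = {c}")
    return examples


def involves_zero(example):
    """True if the number 0 appears as an operand or as the result."""
    tokens = example.replace("=", " ").split()
    return "0" in tokens  # exact token match, so '10' does not count


def split_held_out_zero(max_n=9):
    """Train on problems that never mention 0; test only on those that do."""
    all_examples = make_examples(range(max_n + 1))
    train = [e for e in all_examples if not involves_zero(e)]
    test = [e for e in all_examples if involves_zero(e)]
    return train, test


def with_zero_examples(train, test, k, seed=0):
    """Move k zero examples from test into train (hypothetical helper)."""
    chosen = set(random.Random(seed).sample(test, k))
    new_train = train + sorted(chosen)
    new_test = [e for e in test if e not in chosen]
    return new_train, new_test


train, test = split_held_out_zero()
boosted_train, remaining_test = with_zero_examples(train, test, k=20)
```

In this sketch, `split_held_out_zero` corresponds to the test-time generalization question (the model never sees zero during training), while `with_zero_examples` corresponds to the follow-up condition in which some examples of zero are added to training. How the authors actually built and sized their data is not described here, so treat the numbers as placeholders.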

How did GPT-2–size models perform?

GPT-2–size language models failed to perform the key generalization at test time, whether or not they had language pretraining. The paper states plainly that "language models of a GPT-2 size are unable to perform this generalization at test time regardless of language pretraining." It adds that the models improve substantially once they are trained on additional examples of zero.

Concretely, the authors report that models improve after being trained on "tens or hundreds of examples of zero." They also find that language pretraining reduces the number of required examples by approximately 50 percent, showing that prior language training lowers sample requirements even though it does not by itself enable the test-time generalization. Those are the central empirical takeaways the paper highlights.

Why it matters

If a common architecture and scale like GPT-2 cannot discover zero without explicit examples, that constrains claims about neural networks making genuinely novel mathematical discoveries from language pattern learning alone. The finding that pretraining cuts required examples by about 50 percent suggests language skills provide useful inductive scaffolding but are not sufficient to bridge the gap to true out-of-distribution invention. For researchers aiming to push models toward independent conceptual discovery, the result maps a concrete boundary at this model size: language pretraining helps, but tens or hundreds of targeted examples remain necessary for this basic mathematical notion.

What to watch

Look for follow-up work testing different model scales, architectures, or training curricula and for any papers that replace "tens or hundreds" with exact sample counts or that demonstrate discovery without additional examples. Also watch for experiments that operationalize other minimal mathematical concepts beyond zero to see whether the sample-gap and the roughly 50 percent pretraining benefit generalize.

The paper can be accessed on arXiv as arXiv:2606.17289 and was submitted on 15 Jun 2026 by Phoebe Zeng, Thomas L. Griffiths and Brenden M. Lake.


Written by The Brieftide · Source: arXiv
