Adrian de Wynter builds neural net in Age of Empires II
A Microsoft and University of York researcher used goats as bits and in-game scripting to argue that many LLM papers wrongly attribute human-like traits to models.
TL;DR
- A Microsoft and University of York researcher used goats as bits and in-game scripting to argue that many LLM papers wrongly attribute human-like traits to models.
- The build maps goats to bits, uses the scenario editor for gates, and reproduces computation in a way meant to expose how easy it is to mistake math for mind.
- Ice ramps populated with waiting goats keep calculations from getting jumbled, and the trained perceptron appears in-game as a maze of walls through which goats wander.
Adrian de Wynter, a researcher at Microsoft and the University of York, has built a working neural network inside the map editor of Age of Empires II, using in-game objects and scripting to implement logic gates and a trained perceptron.
The build maps goats to bits, uses the scenario editor for gates, and reproduces computation in a way meant to expose how easy it is to mistake math for mind.
How did he build a neural network in Age of Empires II?
De Wynter encoded bits with goats: a goat standing on grass equals 0, and a goat standing on a bridge equals 1. He assembled logic gates with the scenario editor's scripting tools. The finished mini-network contains two XNOR gates and one AND gate, and it learns the logical AND function. Ice ramps populated with waiting goats keep calculations from getting jumbled, and the trained perceptron appears in-game as a maze of walls through which goats wander.
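To make the idea concrete outside the game, here is a minimal Python sketch of the same ingredients: the goat-to-bit encoding, XNOR and AND gates, and a single perceptron trained on the AND function. It is not De Wynter's published code and does not reproduce how his gates are wired together. The names `GOAT_BIT`, `xnor`, `and_gate`, `train_perceptron`, and `predict` are hypothetical, and the classic perceptron learning rule is an assumption chosen for illustration.

```python
# Encoding described in the article: goat on grass -> 0, goat on a bridge -> 1.
GOAT_BIT = {"grass": 0, "bridge": 1}


def xnor(a, b):
    """Return 1 when both bits are equal, else 0."""
    return 1 if a == b else 0


def and_gate(a, b):
    """Return 1 only when both bits are 1."""
    return a & b


def predict(weights, bias, x1, x2):
    """Step-activation perceptron: fire (1) when the weighted sum is positive."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0


def train_perceptron(samples, epochs=20):
    """Classic perceptron rule (an illustrative assumption, not the paper's method)."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), target in samples:
            delta = target - predict(weights, bias, x1, x2)
            if delta != 0:
                weights[0] += delta * x1
                weights[1] += delta * x2
                bias += delta
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly, stop early
            break
    return weights, bias


# Truth table for AND, used as the training set.
AND_SAMPLES = [((a, b), and_gate(a, b)) for a in (0, 1) for b in (0, 1)]
weights, bias = train_perceptron(AND_SAMPLES)

if __name__ == "__main__":
    for first in GOAT_BIT:
        for second in GOAT_BIT:
            out = predict(weights, bias, GOAT_BIT[first], GOAT_BIT[second])
            print(f"goat on {first}, goat on {second} -> {out}")
```

Nothing in the arithmetic depends on what carries the bits. Swapping the dictionary keys for any other two-state object leaves the learned function unchanged, which is the point the goat encoding is meant to make.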
The paper's appendix argues the game is computationally powerful in theory. The in-game market's price cap at 9,999 enables a perpetually running economic cycle, the paper says, where buildings can serve as memory cells and active farms represent the current computational state. De Wynter also shows that, under an idealized version of the game, any computer could be replicated, and he has published the code for the Age of Empires build publicly.
What is the point of this experiment: critique or novelty?
The experiment is meant as a critique of how researchers attribute human-like traits to language models. De Wynter uses the AoE II build and thought experiments — swapping goats for Lego, or 667,000 people in Greater Boston transmitting steps by text — to show that running the same math in a different physical substrate does not imply feelings or cognition.
To demonstrate the scope of the problem, De Wynter analyzed 315 AI papers from mid-2024 to mid-2026, collected through Semantic Scholar and arXiv and filtered using GPT-5.2. His analysis found that 57 percent of those papers assumed in their premises that LLMs have human-like traits, and 36 percent reached matching conclusions. Among the 47 papers that made such traits their explicit subject, 77 percent concluded in favor of anthropomorphic attributes. De Wynter argues that if a study starts by assuming a model has fear, morality, or self-awareness and then designs experiments to prove those traits, the reasoning becomes circular and the results ambiguous.
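The percentages above can be turned back into rough paper counts. The sketch below assumes the reported figures are rounded to the nearest whole percent, which the article does not state, and lists every count consistent with each one. The helper name `consistent_counts` is hypothetical. These are ranges implied by the rounding, not counts taken from the paper.

```python
def consistent_counts(percent, total):
    """List every count k whose share of `total` rounds to `percent`.

    Assumes the reported percentage was rounded to the nearest integer.
    """
    return [k for k in range(total + 1) if round(100 * k / total) == percent]


if __name__ == "__main__":
    # 57 percent of 315 papers assumed human-like traits in their premises.
    print(consistent_counts(57, 315))  # [178, 179, 180, 181]
    # 36 percent of 315 papers reached matching conclusions.
    print(consistent_counts(36, 315))  # [112, 113, 114]
    # 77 percent of the 47 papers with such traits as explicit subject.
    print(consistent_counts(77, 47))   # [36]
```

Under that rounding assumption, roughly 180 of the 315 papers carried anthropomorphic premises, and 36 of the 47 explicitly focused papers concluded in favor of anthropomorphic attributes.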
He also argues that the industry feeds this effect: Anthropic trained Claude to use phrases like "I believe" or "I am interested in," which De Wynter flags as a risk for fostering emotional attachment, sycophancy, reinforced delusions, and risky behavior. The essay references isolated cases where suicides have been linked to chatbot interactions to underline potential real-world harms.
Why does this matter?
De Wynter's work reframes a methodological debate: packaging and interface shape how people perceive models, but they do not add internal states. The Age of Empires build is a reductio ad absurdum aimed at forcing researchers to separate observable input-output behavior from sweeping claims of internal experience. He recommends sticking to testable, observable statements and updates Morgan's canon for machine behavior, arguing against invoking higher cognitive explanations when simpler accounts suffice.
This matters for how papers are written, how systems are described to the public, and how regulators and clinicians interpret model behavior. If the field continues to conflate mathematical processes with mental states, researchers risk promoting harmful attachments and misallocating research effort.
What to watch
Watch for replications of the Age of Empires II build using De Wynter's publicly available code and for whether subsequent AI papers, after mid-2026, shift away from assuming anthropomorphic premises. Also note reactions from groups studying LLM evaluation, and any follow-ups that test the limits of the game's economic memory trick involving the 9,999 price cap.
Written by The Brieftide · Source: The Decoder