Open Source AI · 4 min read

In the Weights shows which people are stored in LLM weights and ranks them on a leaderboard

The site queries multiple models, assigns a numeric "strength" score (max 996), and was built by Joey Flynn and Thomas Dimson.

The Brieftide

TL;DR

  • 01 The site queries multiple models, assigns a numeric "strength" score (max 996), and was built by Joey Flynn and Thomas Dimson.
  • 02 In the Weights is a website that tests whether large language models encode specific people inside their parameter weights.
  • 03 In the Weights queries multiple models, merges the model outputs and assigns a single strength score to a name; the site's author and his colleague currently have scores of 175 and 262.

In the Weights is a website that tests whether large language models encode specific people inside their parameter weights. Built by Joey Flynn and Thomas Dimson, the site queries several models, combines their outputs and gives each queried name a numeric "strength" score; the leaderboard tops out at 996 for widely known figures such as Mozart, Shakespeare and Taylor Swift.

How does In the Weights decide if a person is "stored"?

In the Weights queries multiple models, merges the model outputs and assigns a single strength score to a name; the site's author and his colleague currently have scores of 175 and 262. The site treats those scores as an indicator that a model considered the person relevant enough during training to recall without external tools like web search.

The creators note that smaller models make it harder for a name to appear in results. They single out Meta's Llama as an example: appearing in Llama, which the site notes has a billion parameters, counts as a sign of high relevance. The leaderboard is used to contextualize scores, with a maximum score of 996 reserved for household names such as Mozart, Shakespeare and Taylor Swift.
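
As a rough illustration of the merging step described above, the Python sketch below combines per-model results into a single score. The source does not describe the site's actual formula, so everything here is an assumption: the `ModelResult` fields, the inverse-square-root size weighting, the model names and sizes in the usage example, and the choice to treat 996 (the leaderboard's reported top score) as a hard cap.

```python
from dataclasses import dataclass

# Highest leaderboard score reported in the article.
# Treating it as a hard cap is an assumption made for this sketch.
MAX_SCORE = 996


@dataclass(frozen=True)
class ModelResult:
    """Hypothetical record of one model's answer about one name."""
    model: str
    params_billions: float  # assumed model size, in billions of parameters
    recognized: bool        # True if the model recalled the person without search


def weight(params_billions: float) -> float:
    """Hypothetical weighting: recall by a smaller model counts for more."""
    return 1.0 / (1.0 + params_billions) ** 0.5


def strength(results: list[ModelResult]) -> int:
    """Merge per-model results into one score between 0 and MAX_SCORE."""
    total = sum(weight(r.params_billions) for r in results)
    if total == 0:
        return 0
    hit = sum(weight(r.params_billions) for r in results if r.recognized)
    return round(MAX_SCORE * hit / total)


if __name__ == "__main__":
    # Illustrative inputs only; these are not the models or sizes the site uses.
    results = [
        ModelResult("small-model", 1.0, True),
        ModelResult("large-model", 70.0, True),
    ]
    print(strength(results))
```

Under this weighting, a name recalled only by the small model scores higher than one recalled only by the large model, which mirrors the creators' point that appearing in a small model is the stronger signal.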

What are the limits and common failure modes?

The creators explicitly flag several limitations: models can hallucinate biographical details, typos drag down strength scores and common names often produce worse results. That means a high score is not a perfect or verified biography; it is a signal about presence in training data and model attention, not a definitive fact about a person.

Those caveats matter because the tool relies on what models emit without search or external retrieval. If a model fabricates details or conflates similar names, the site will record that output and fold it into its combined score. The site therefore reflects both genuine stored knowledge and the well-known failure modes of large language models.

Why it matters

The site puts a concrete, numeric lens on a question that has been mostly qualitative: how much LLMs "know" about individuals from their training data. A strength score gives researchers, journalists and curious individuals a simple metric to compare names across models. It also exposes a trade-off between model size and recall: according to the site's creators, smaller models are less likely to surface personal names, so appearing in a one-billion-parameter model like Meta's Llama signals higher relevance.

That matters for privacy and for how people understand model behavior. If a name appears with a nontrivial strength score without any web search, the simplest interpretation is that the training mix included references to that person. At the same time, the documented failure modes make clear that a score alone should not be treated as evidence of factual knowledge about private individuals.

What to watch

See whether the site expands the roster of models it queries and whether its scoring adjusts to reduce noise from typos and common-name collisions. The creators have already documented limits such as hallucinations and typographic sensitivity; changes to the model set or scoring method would be the clearest signals that the tool is moving from exploratory demo toward a more robust measurement.
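
One conceivable way to reduce typo noise is to normalize a queried name and fuzzy-match it against names already on the leaderboard before any model is queried. The sketch below uses Python's standard `difflib` and `unicodedata` modules. It is not the site's method: the `normalize` and `suggest` helpers, the 0.85 cutoff and the sample names are all assumptions, and the approach does nothing about common-name collisions.

```python
import difflib
import unicodedata
from typing import Optional


def normalize(name: str) -> str:
    """Lowercase, strip accents and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def suggest(name: str, known_names: list[str], cutoff: float = 0.85) -> Optional[str]:
    """Return the closest known name if it is similar enough, else None.

    The cutoff is a hypothetical threshold on difflib's similarity ratio.
    """
    by_key = {normalize(k): k for k in known_names}
    matches = difflib.get_close_matches(
        normalize(name), list(by_key), n=1, cutoff=cutoff
    )
    return by_key[matches[0]] if matches else None


if __name__ == "__main__":
    # Sample names taken from the leaderboard examples in the article.
    leaderboard = ["Mozart", "Shakespeare", "Taylor Swift"]
    print(suggest("Tayler  Swift", leaderboard))
```

A match could be shown as a "did you mean" prompt rather than applied silently, since a near-identical spelling may belong to a different person.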

The site offers an accessible way to probe what models recall from their weights, while also serving as a reminder that model outputs require context and skepticism. Specific data points from the site — the author's and his colleague's strength scores of 175 and 262 and the leaderboard cap of 996 — provide concrete anchors for follow-up testing and reporting.

[Figure: How In the Weights works: core elements. Labels: In the Weights, Strength score, Models queried, Creators, Leaderboard, Limitations, Signal]

Written by The Brieftide · Source: The Decoder
