Multimodal AI · June 18, 2026 · 4 min read

User as Engram: Per-user memory edits, arXiv paper 2026 by Bojie Li

Bojie Li proposes storing per-user facts as local Engram table edits, claiming a roughly 33,000x smaller memory footprint than folding them into weights.

The Brieftide · June 18, 2026

TL;DR

01 Bojie Li proposes storing per-user facts as local Engram table edits, claiming a roughly 33,000x smaller memory footprint than folding them into weights.
02 Beyond that mechanism the design separates two problems most personalization systems conflate: content and reasoning skill.
03 User as Engram reframes personalization as a storage-layout problem inside the model rather than an external retrieval or global-weight problem.

Bojie Li submitted a paper titled "User as Engram: Internalizing Per-User Memory as Local Parametric Edits" to arXiv on 17 Jun 2026, arguing that per-user facts belong inside a hash-keyed Engram table while shared reasoning skill belongs in one adapter. The paper presents a layered design that stores content as local edits and carries reasoning ability in a shared adapter, and reports quantitative comparisons to per-user LoRA and retrieval pipelines.

What is the User as Engram design and how does it work?

User as Engram stores a user's content as surgical edits to the hash-keyed memory table of an Engram model, while one shared adapter holds reasoning skill; writing a fact turns on an exact lookup and otherwise leaves every other position unchanged. The paper describes each edit as a local parametric modification that "switches on its lookup at exactly the trigger, adds the value the answer needs, leaves every other position unchanged to the last bit, and fails if written into the wrong layer." Content rows land in disjoint hash slots so multiple users' facts compose additively and losslessly inside one shared table.
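The write-and-lookup behavior described above can be pictured with a toy table. This is only a minimal sketch under stated assumptions, not the paper's implementation: the class `ToyEngramTable`, the function `slot_for`, the SHA-256 hashing, and the sizes `NUM_SLOTS` and `DIM` are all hypothetical, and the layer-placement constraint the paper mentions is not modeled at all.

```python
import hashlib

NUM_SLOTS = 1 << 16  # hypothetical table size, not taken from the paper
DIM = 4              # hypothetical width of a stored value


def slot_for(trigger: str) -> int:
    """Map a trigger string to a table slot with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(trigger.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SLOTS


class ToyEngramTable:
    """Sparse table: slots that were never written behave as all-zero rows."""

    def __init__(self) -> None:
        self.rows = {}  # slot index -> list of floats

    def write(self, trigger: str, value: list) -> int:
        """Add `value` to the row keyed by `trigger` and touch nothing else."""
        slot = slot_for(trigger)
        row = self.rows.get(slot, [0.0] * DIM)
        self.rows[slot] = [r + v for r, v in zip(row, value)]
        return slot

    def lookup(self, trigger: str) -> list:
        """Return the row for `trigger`, or zeros if it was never written."""
        return list(self.rows.get(slot_for(trigger), [0.0] * DIM))


table = ToyEngramTable()
table.write("alice: favorite tea", [1.0, 0.0, 0.5, 0.0])
hit = table.lookup("alice: favorite tea")   # the exact trigger returns the value
miss = table.lookup("bob: favorite tea")    # zeros unless the two hashes collide
```

In this toy, one write creates exactly one row, so every other position is literally unchanged because it was never touched. How the real model routes a lookup into its forward pass is outside what this sketch shows.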

Beyond that mechanism the design separates two problems most personalization systems conflate: content and reasoning skill. The Engram approach places episodic, per-user content into sparse local rows analogous to an engram, and carries the model's shared interpretive skill in a single adapter, matching the brain metaphor the paper advances.

How does User as Engram compare to per-user LoRA and retrieval pipelines?

User as Engram aims to match per-user LoRA's direct recall while avoiding LoRA's global weight contamination. The paper reports concrete performance and cost differences: writing facts as local Engram rows yields a roughly 33,000x smaller memory footprint than folding them into weights, and the layered design delivers 5.6x higher indirect-reasoning accuracy on average. The author states that writing a user's facts as a LoRA "folds content and skill into one global weight delta," which can contaminate unrelated text, whereas Engram edits leave the base model weights mathematically untouched.
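The contrast between a global weight delta and a local keyed edit can be illustrated with a tiny linear toy. This is a hedged sketch, not the paper's experiment: the 3x3 matrix `W`, the rank-1 vectors `u` and `v`, the `memory` dict, and the key strings are all made up for illustration, and a real LoRA sits inside transformer layers rather than a single matrix.

```python
def matvec(m, x):
    """Multiply matrix m (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def outer(u, v):
    """Rank-1 matrix u v^T."""
    return [[ui * vj for vj in v] for ui in u]


def add(a, b):
    """Elementwise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical base weights (identity, for readability).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

# LoRA-style update: a low-rank delta folded into the shared weights.
u = [0.5, 0.0, 0.0]
v = [1.0, 1.0, 0.0]
W_lora = add(W, outer(u, v))

# Local edit: a keyed row that is added only when its trigger key matches.
memory = {"user-fact-key": [0.5, 0.0, 0.0]}


def forward_with_memory(x, key):
    """Base output plus the stored value for `key`, if any; W is never modified."""
    base = matvec(W, x)
    edit = memory.get(key)
    return base if edit is None else [b + e for b, e in zip(base, edit)]


user_input = [1.0, 0.0, 0.0]
unrelated_input = [0.0, 1.0, 0.0]

same_answer = matvec(W_lora, user_input) == forward_with_memory(user_input, "user-fact-key")
delta_leaks = matvec(W_lora, unrelated_input) != matvec(W, unrelated_input)
edit_is_local = forward_with_memory(unrelated_input, "other-key") == matvec(W, unrelated_input)
```

Both routes give the same output on the user's own input here, but the folded-in delta also shifts the output for an unrelated input, while the keyed edit leaves it identical to the base. Whether and how strongly this happens in a real model depends on the learned delta.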

The paper also compares retrieval scaling: because a per-user Engram table does not enlarge the pool a retriever must search, it overtakes a retrieval pipeline running on a 2.5x larger model after about 100 facts. Engram edits are also composable across users because different users' facts map to disjoint hash slots, whereas a single global weight delta such as a per-user LoRA inherently admits only one composition.
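Composition over disjoint slots can be sketched as a plain dictionary merge. The function `compose`, the slot numbers, and the choice to raise an error on a shared slot are assumptions made for illustration; the article only says that different users' rows land in disjoint slots, not how the paper enforces that.

```python
def compose(*edit_sets):
    """Merge per-user edit sets (slot -> value) into one shared table.

    Assumption: a slot claimed by two users is treated as an error,
    because the lossless property relies on slots being disjoint.
    """
    merged = {}
    for edits in edit_sets:
        for slot, value in edits.items():
            if slot in merged:
                raise ValueError(f"slot {slot} written by more than one user")
            merged[slot] = list(value)
    return merged


# Hypothetical slot indices and values for two users.
alice = {17: [1.0, 0.0], 402: [0.0, 2.0]}
bob = {93: [0.5, 0.5]}

ab = compose(alice, bob)
ba = compose(bob, alice)

order_independent = ab == ba
alice_recovered = {slot: ab[slot] for slot in alice} == alice
bob_recovered = {slot: ab[slot] for slot in bob} == bob
```

With disjoint slots, adding the edit sets and taking their union are the same operation, which is why the merge is order-independent and each user's rows can be read back exactly.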

Why it matters

User as Engram reframes personalization as a storage-layout problem inside the model rather than an external retrieval or global-weight problem. That matters because the paper backs the claim with concrete metrics: roughly 33,000x smaller memory footprint and a 5.6x boost in indirect-reasoning accuracy, plus a scaling crossover point at ~100 facts versus a retrieval pipeline on a 2.5x larger model. If those numbers hold in broader evaluations, teams building long-term personal memory, on-device personalization, or multi-user services could trade large external indices or per-user adapters for compact, composable in-model edits.

The paper further emphasizes auditability and failure modes: the edit is described as a glass box, so writes are precise and reversible at the layer level, and, according to the author, the approach never leaves any single user worse at reasoning than the untouched base model.

What to watch

Look for open-source implementations, replication studies, or follow-up experiments that measure the 33,000x footprint claim and the 5.6x indirect-reasoning improvement across diverse model families and real-world tasks. Also watch whether Engram-style hash-slot composition holds under adversarial or dense overlapping user facts and how toolchains handle correct-layer writes versus miswrites.
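One way to reason about dense or overlapping facts is the standard birthday-problem approximation for hash collisions. This is a rough sketch that assumes facts are hashed uniformly and independently into slots, which the article does not confirm for the paper's scheme; the function name and the table sizes below are hypothetical.

```python
import math


def collision_probability(num_facts: int, num_slots: int) -> float:
    """Approximate chance that at least two facts share a slot.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2m)),
    valid under the assumption of uniform, independent hashing.
    """
    if num_facts < 2:
        return 0.0
    exponent = num_facts * (num_facts - 1) / (2.0 * num_slots)
    return -math.expm1(-exponent)  # equals 1 - exp(-exponent), with less rounding error


# Hypothetical table of 2**20 slots at two fact counts.
small_load = collision_probability(100, 1 << 20)
heavy_load = collision_probability(10_000, 1 << 20)
```

Under these assumptions a collision is unlikely at a hundred facts and close to certain at ten thousand for the same table, so replication work would need to report table size and collision handling alongside accuracy.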

Figure: High-level components of the User as Engram design