Coding Agents · June 16, 2026 · 5 min read

Dr-DCI: Dynamic Workspace Expansion scales Corpus Interaction

Dr-DCI treats retrieval as an agent action to expand a local workspace.

The Brieftide · June 16, 2026

Dr-DCI, presented in a paper submitted on 12 Jun 2026 by Yi Lu and nine coauthors, reframes Direct Corpus Interaction by making retrieval an agent-callable action that expands a local workspace. The paper, arXiv:2606.14885, argues this lets agents avoid running slow, unstable full-corpus commands by operating over a focused, evolving set of documents.

How Dr-DCI works

Direct Corpus Interaction, or DCI, exposes shell-executable corpus operations for flexible search, filtering, comparison, and verification. The paper notes that retriever-mediated interfaces such as BM25 or ColBERT scale discovery but limit agents to ranked results or bounded document views. Raw DCI exposes richer operations, but the authors show that full-corpus terminal commands become slow and unstable as corpora grow.

Dr-DCI, short for retriever-steered DCI, treats retrieval as an action an agent can call to pull relevant documents into a local workspace. The agent then runs DCI operations inside that workspace. The authors summarize the design as: "retrieval keeps exploration scalable, while DCI preserves the local operations needed for effective evidence resolution." The implementation emphasizes an evolving workspace rather than direct, repeated operations over the entire corpus.
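The retrieve-then-operate loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the `Workspace` class, its `retrieve` and `grep` methods, and the term-overlap scoring are all hypothetical stand-ins (a real system would use a retriever such as BM25 and shell commands over files).

```python
from dataclasses import dataclass, field


def tokenize(text: str) -> set[str]:
    """Lowercase whitespace tokenizer; a stand-in for a real retriever's analysis."""
    return set(text.lower().split())


@dataclass
class Workspace:
    """Hypothetical local, growing subset of the corpus that the agent operates on."""
    corpus: dict[str, str]                               # doc_id -> text (full corpus)
    docs: dict[str, str] = field(default_factory=dict)   # doc_id -> text (workspace)

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        """Retrieval as an action: rank the corpus by term overlap (an assumed,
        toy scoring rule) and pull the top-k matching documents into the workspace."""
        q = tokenize(query)
        scored = [(len(q & tokenize(text)), doc_id) for doc_id, text in self.corpus.items()]
        ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
        added = [doc_id for score, doc_id in ranked[:k] if score > 0]
        for doc_id in added:
            self.docs[doc_id] = self.corpus[doc_id]
        return added

    def grep(self, needle: str) -> list[str]:
        """A DCI-style local operation: scan only the workspace, never the full corpus."""
        return sorted(d for d, text in self.docs.items() if needle.lower() in text.lower())
```

In this sketch, `grep` can only see documents that an earlier `retrieve` call pulled in, so the cost of each local operation depends on the workspace size rather than the corpus size. That is the property the authors' summary points at.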

The submission lists authors Yi Lu, Zhuofeng Li, Ping Nie, Haoxiang Zhang, Yuyu Zhang, Kai Zou, Wenhu Chen, Jimmy Lin, Dongfu Jiang, and Yu Zhang, and the manuscript runs 25 pages with 4 figures and 22 tables.

Results across benchmarks and scales

On the BrowseComp-Plus benchmark, Dr-DCI reaches 71.2% accuracy. The paper further shows that a workspace-preserving context reset raises accuracy to 73.3%, a gain of 2.1 percentage points. Compared with raw DCI and ablated variants, Dr-DCI improves accuracy by up to 8.3 percentage points while reducing tool usage, wall time, and estimated cost.
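This brief does not describe how the workspace-preserving context reset is implemented. One plausible reading of the name is that the agent discards its accumulated conversation history while leaving the retrieved documents in place. The sketch below illustrates only that reading; `AgentState`, `reset_context`, and the carry-over note are assumptions, not the authors' design.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical agent state: transient context plus a persistent workspace."""
    history: list[str] = field(default_factory=list)        # messages and tool outputs
    workspace: dict[str, str] = field(default_factory=dict)  # doc_id -> text pulled so far

    def reset_context(self, note: str) -> None:
        """Drop the accumulated history but leave the workspace untouched.
        A short carry-over note (an assumption) replaces the discarded turns."""
        self.history = [note]
```

Under this reading, a reset frees context budget without forcing the agent to repeat earlier retrieval calls, since the documents it already gathered remain available.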

In corpus-scaling experiments, the authors report that Dr-DCI remains effective from 100K to 10M documents. By contrast, raw DCI becomes unstable at larger corpus sizes and BM25 "performs substantially worse." Dr-DCI also scales to a 20M-scale file-per-document Wiki-18 QA setting, achieving an average score of 63.0 across six benchmarks and outperforming retrieval-based and trained search-agent baselines.

Ablation analysis in the paper highlights two design elements as especially important: ranked previews and inter-document DCI. The paper reports both as key contributors to the framework's gains over ablated variants.
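The brief names these two components without defining them. As a rough illustration of what each could look like, the sketch below formats retrieval hits as short rank-ordered previews and runs one operation that spans several workspace documents. Both function names, the snippet width, and the choice of "shared terms" as the cross-document operation are assumptions made for the example.

```python
def ranked_preview(ranked: list[tuple[str, str]], width: int = 40) -> list[str]:
    """Format (doc_id, text) pairs, already in rank order, as numbered one-line previews."""
    lines = []
    for rank, (doc_id, text) in enumerate(ranked, start=1):
        snippet = text[:width] + ("..." if len(text) > width else "")
        lines.append(f"{rank}. {doc_id}: {snippet}")
    return lines


def shared_terms(workspace: dict[str, str], doc_ids: list[str]) -> set[str]:
    """A toy inter-document operation: terms that appear in every listed document."""
    token_sets = [set(workspace[d].lower().split()) for d in doc_ids]
    return set.intersection(*token_sets) if token_sets else set()
```

A preview lets the agent decide which hits deserve a full read, while a cross-document operation such as `shared_terms` checks a constraint that no single-document view can answer.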

Why it matters

Agents that must resolve evidence across documents need both scale and the ability to manipulate material locally. Dr-DCI tries to provide both by making retrieval part of the agent's toolkit rather than the sole interface. If the reported gains hold in broader settings, agents can run fewer expensive corpus-wide operations, finish tasks faster, and verify cross-document constraints more reliably.

The scaling results matter because they show a path for agentic search to operate on millions of documents while retaining the flexible, programmatic operations that researchers value in DCI. The paper makes a concrete claim about performance and cost trade-offs, not just conceptual benefits.

What to watch

Look for public code, datasets, or replication experiments linked from the arXiv entry to validate the wall-time and cost claims. Also watch whether ranked previews and inter-document DCI, the two ablation-identified levers, generalize across other retrieval backends beyond BM25 and ColBERT.

Key Dr-DCI results and comparisons (from the paper)

Item	Dr-DCI	Raw DCI and ablated variants	BM25	Retrieval-based and trained search-agent baselines
BrowseComp-Plus accuracy	71.2%	Up to 8.3 pts lower than Dr-DCI (per ablations)	N/A	N/A
With workspace-preserving reset	73.3%	N/A	N/A	N/A
Corpus scaling (effective range)	100K to 10M documents	Becomes unstable at scale	Performs substantially worse	N/A
20M-scale Wiki-18 QA average score (six benchmarks)	63.0	N/A	N/A	Outperformed by Dr-DCI
Key ablation findings	Ranked previews and inter-document DCI are key	N/A	N/A	N/A

Written by The Brieftide · Source: arXiv
