
Dr-DCI: Dynamic Workspace Expansion scales Corpus Interaction

Dr-DCI treats retrieval as an agent action to expand a local workspace.

The Brieftide


Dr-DCI, presented in a paper submitted on 12 Jun 2026 by Yi Lu and nine coauthors, reframes Direct Corpus Interaction by making retrieval an agent-callable action that expands a local workspace. The paper, arXiv:2606.14885, argues this lets agents avoid running slow, unstable full-corpus commands by operating over a focused, evolving set of documents.

How Dr-DCI works

Direct Corpus Interaction (DCI) exposes shell-executable corpus operations for flexible search, filtering, comparison, and verification. The paper notes that retriever-mediated interfaces such as BM25 or ColBERT scale discovery but limit agents to ranked results or bounded document views. Raw DCI exposes richer operations, but the authors show that full-corpus terminal commands become slow and unstable as corpora grow.

Dr-DCI, short for retriever-steered DCI, treats retrieval as an action an agent can call to pull relevant documents into a local workspace. The agent then runs DCI operations inside that workspace. The authors summarize the design as: "retrieval keeps exploration scalable, while DCI preserves the local operations needed for effective evidence resolution." The implementation emphasizes an evolving workspace rather than direct, repeated operations over the entire corpus.
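The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Workspace` class, its `retrieve` and `grep` methods, the toy `CORPUS`, and the term-overlap scoring (a stand-in for a real retriever such as BM25) are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field

# Toy corpus standing in for a large document collection (hypothetical data).
CORPUS = {
    "doc1": "The Eiffel Tower was completed in 1889 in Paris.",
    "doc2": "Gustave Eiffel's company built the tower for the World's Fair.",
    "doc3": "Mount Everest is the highest mountain above sea level.",
    "doc4": "The 1889 World's Fair was held in Paris, France.",
}


def tokenize(text: str) -> set[str]:
    """Lowercase, split on whitespace, and strip basic edge punctuation."""
    return {w.strip(".,'\"").lower() for w in text.split()}


@dataclass
class Workspace:
    """Local, growing set of documents that the agent operates on."""

    docs: dict[str, str] = field(default_factory=dict)

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        """Retrieval as an action: pull the top-k matches into the workspace.

        Scoring is plain term overlap, an assumed stand-in for a real retriever.
        Returns the ids of documents newly added by this call.
        """
        q = tokenize(query)
        ranked = sorted(
            CORPUS.items(),
            key=lambda item: len(q & tokenize(item[1])),
            reverse=True,
        )
        added = []
        for doc_id, text in ranked[:k]:
            if q & tokenize(text) and doc_id not in self.docs:
                self.docs[doc_id] = text
                added.append(doc_id)
        return added

    def grep(self, needle: str) -> list[str]:
        """Local operation: scan only workspace documents, never the full corpus."""
        needle = needle.lower()
        return sorted(d for d, t in self.docs.items() if needle in t.lower())
```

In this sketch, an agent might call `retrieve("Eiffel tower Paris", k=1)`, run `grep("1889")` over the single document it now holds, then call `retrieve` again with a refined query to widen the workspace before repeating the local check. The cost of each `grep` depends on the workspace size rather than the corpus size, which is the property the article attributes to the design.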

The submission lists authors Yi Lu, Zhuofeng Li, Ping Nie, Haoxiang Zhang, Yuyu Zhang, Kai Zou, Wenhu Chen, Jimmy Lin, Dongfu Jiang, and Yu Zhang, and the manuscript runs 25 pages with 4 figures and 22 tables.

Results across benchmarks and scales

On the BrowseComp-Plus benchmark, Dr-DCI reaches 71.2% accuracy. The paper further shows that a workspace-preserving context reset raises accuracy to 73.3%. Compared with raw DCI and ablated variants, Dr-DCI improves accuracy by up to 8.3 percentage points while reducing tool usage, wall time, and estimated cost.
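To put the context-reset figure in perspective, the short calculation below separates the absolute gain in percentage points from the relative gain and the share of errors removed. Only the two accuracy values come from the article; the function name `gains` and the choice of derived metrics are illustrative assumptions.

```python
def gains(before: float, after: float) -> dict[str, float]:
    """Compare two accuracy percentages three ways, rounded to one decimal."""
    delta = after - before
    return {
        # Absolute difference in percentage points.
        "points": round(delta, 1),
        # Gain relative to the starting accuracy, in percent.
        "relative_pct": round(100 * delta / before, 1),
        # Share of the remaining errors that the change removes, in percent.
        "error_reduction_pct": round(100 * delta / (100 - before), 1),
    }


# Reported accuracy without and with the workspace-preserving context reset.
reset_effect = gains(71.2, 73.3)
print(reset_effect)
```

The reset adds 2.1 percentage points, which is about a 2.9% relative improvement and roughly 7.3% fewer errors.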

In corpus-scaling experiments, the authors report that Dr-DCI remains effective from 100K to 10M documents. By contrast, raw DCI becomes unstable at larger corpus sizes and BM25 "performs substantially worse." Dr-DCI also scales to a 20M-scale file-per-document Wiki-18 QA setting, achieving an average score of 63.0 across six benchmarks and outperforming retrieval-based and trained search-agent baselines.

Ablation analysis in the paper highlights two design elements as especially important: ranked previews and inter-document DCI. Those components are reported as key contributors to the framework's performance gains over ablated variants.
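The article does not define either component in detail, so the sketch below shows only one plausible reading: a ranked preview as a scored list of short snippets the agent can inspect before opening a document, and an inter-document operation as a comparison that needs two workspace documents at once. The functions `ranked_preview` and `shared_terms`, the overlap scoring, and the sample documents are all hypothetical.

```python
def tokenize(text: str) -> list[str]:
    """Lowercase, split on whitespace, and strip basic edge punctuation."""
    return [w.strip(".,;:'\"()").lower() for w in text.split()]


def ranked_preview(docs: dict[str, str], query: str, k: int = 3, width: int = 40):
    """Return up to k (doc_id, score, snippet) rows, best match first.

    Score is the number of distinct query terms found in the document,
    an assumed stand-in for a real retriever's relevance score.
    """
    q = set(tokenize(query))
    rows = []
    for doc_id, text in docs.items():
        score = len(q & set(tokenize(text)))
        if score:
            rows.append((doc_id, score, text[:width]))
    # Sort by descending score, breaking ties by document id.
    rows.sort(key=lambda r: (-r[1], r[0]))
    return rows[:k]


def shared_terms(docs: dict[str, str], a: str, b: str, min_len: int = 5) -> list[str]:
    """Inter-document check: longer terms that appear in both documents."""
    terms_a = {t for t in tokenize(docs[a]) if len(t) >= min_len}
    terms_b = {t for t in tokenize(docs[b]) if len(t) >= min_len}
    return sorted(terms_a & terms_b)


# Hypothetical workspace contents.
workspace = {
    "a": "Marie Curie won the Nobel Prize in Physics in 1903.",
    "b": "Pierre Curie shared the 1903 Nobel Prize with Marie Curie.",
    "c": "The Amazon river flows through Brazil.",
}

preview = ranked_preview(workspace, "Curie Nobel", k=2, width=11)
overlap = shared_terms(workspace, "a", "b")
```

Here `preview` lists documents `a` and `b` with truncated snippets and leaves out `c`, while `overlap` surfaces the terms the two matching documents have in common, the kind of cross-document evidence a single ranked list does not expose.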

Why it matters

Agents that must resolve evidence across documents need both scale and the ability to manipulate material locally. Dr-DCI addresses the tension between the two by making retrieval part of the agent's toolkit rather than the sole interface. If the reported gains hold in broader settings, agents could run fewer expensive corpus-wide operations, finish tasks faster, and verify cross-document constraints more reliably.

The scaling results matter because they show a path for agentic search to operate on millions of documents while retaining the flexible, programmatic operations that researchers value in DCI. The paper makes a concrete claim about performance and cost trade-offs, not just conceptual benefits.

What to watch

Look for public code, datasets, or replication experiments linked from the arXiv entry to validate the wall-time and cost claims. Also watch whether ranked previews and inter-document DCI, the two levers identified by the ablations, generalize to retrieval backends beyond BM25 and ColBERT.

Key Dr-DCI results and comparisons (from the paper)

| Item | Dr-DCI | Raw DCI / ablated variants | BM25 | Other baselines |
| BrowseComp-Plus accuracy | 71.2% | Up to 8.3 pts lower than Dr-DCI (per ablations) | N/A | N/A |
| With workspace-preserving reset | 73.3% | N/A | N/A | N/A |
| Corpus scaling (effective range) | 100K to 10M documents | Becomes unstable at scale | Performs substantially worse | N/A |
| 20M-scale Wiki-18 QA average score (six benchmarks) | 63.0 | N/A | N/A | Outperformed by Dr-DCI |
| Key ablation findings | Ranked previews and inter-document DCI are key | N/A | N/A | N/A |

Written by The Brieftide · Source: arXiv
