Coding Agents · July 3, 2026 · 5 min read

ContextSniper (AntTrail): token-efficient code memory, benchmarks

AntTrail's ContextSniper cuts token use by up to 51.5% on SWE-bench Lite while slightly lowering repair rates in tests with OpenClaw and Claude Code.

The Brieftide · July 3, 2026

TL;DR

01 AntTrail's ContextSniper cuts token use by up to 51.5% on SWE-bench Lite while slightly lowering repair rates in tests with OpenClaw and Claude Code.
02 AntTrail's ContextSniper, submitted to arXiv on 2 July 2026, is a token-efficient code memory layer built for repository-level program repair.
03 ContextSniper is AntTrail's coding-specialized memory layer that selects and returns compact evidence packets instead of dumping whole files and long logs into prompts.

AntTrail's ContextSniper, submitted to arXiv on 2 July 2026, is a token-efficient code memory layer built for repository-level program repair. In SWE-bench Lite tests using OpenClaw and Claude Code with 50 task runs per host-agent condition, ContextSniper reduced total token use by 51.5% for OpenClaw and 38.9% for Claude Code, and cut logged or estimated cost by 36.4% and 27.3% respectively, while submitted-resolution rates fell slightly.

What is ContextSniper and how does it work?

ContextSniper is AntTrail's coding-specialized memory layer that selects and returns compact evidence packets instead of dumping whole files and long logs into prompts. It implements the Sniper feature: retrieving candidate code and runtime evidence, ranking that evidence with hybrid retrieval signals, filtering long outputs through an intention-aware context gate, and returning compact packets while preserving recoverable source context outside the prompt.

The paper frames ContextSniper as a specialization of AntTrail's broader agent memory engine. The system aims to avoid wasting context budget on whole-file reads, broad searches and long terminal outputs where useful evidence is mixed with irrelevant code and logs. The authors describe a pipeline that narrows evidence to high-precision items and gates what is placed into the model prompt.
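To make the retrieve, rank, gate and pack steps concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the `Evidence` type, the two lexical overlap signals, their 0.7/0.3 weights and the line-level gate are all hypothetical stand-ins chosen for illustration, since the paper's actual retrieval signals and gating logic are not described here.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Evidence:
    path: str  # origin of the snippet, so the full source stays recoverable outside the prompt
    text: str  # snippet body (code or log lines)


def tokenize(s: str) -> set[str]:
    """Lowercase identifier-like tokens; a deliberately crude tokenizer."""
    return set(re.findall(r"[a-z_][a-z0-9_]*", s.lower()))


def hybrid_score(query: str, ev: Evidence) -> float:
    """Blend two toy signals (assumed weights): overlap with the text and with the path."""
    q = tokenize(query)
    if not q:
        return 0.0
    text_overlap = len(q & tokenize(ev.text)) / len(q)
    path_overlap = len(q & tokenize(ev.path)) / len(q)
    return 0.7 * text_overlap + 0.3 * path_overlap


def gate(query: str, ev: Evidence, max_lines: int = 3) -> Evidence:
    """Stand-in for the context gate: keep only lines that mention a query term."""
    q = tokenize(query)
    kept = [ln for ln in ev.text.splitlines() if q & tokenize(ln)]
    return Evidence(ev.path, "\n".join(kept[:max_lines]))


def build_packet(query: str, candidates: list[Evidence], top_k: int = 2) -> list[Evidence]:
    """Rank candidates, gate the top_k, and drop anything the gate emptied."""
    ranked = sorted(candidates, key=lambda ev: hybrid_score(query, ev), reverse=True)
    gated = [gate(query, ev) for ev in ranked[:top_k]]
    return [ev for ev in gated if ev.text]


candidates = [
    Evidence("src/parser.py", "def parse_date(s):\n    return s.split('-')\n# unrelated helper\ndef noop(): pass"),
    Evidence("logs/test.log", "collected 40 items\nFAILED test_parse_date - ValueError\n38 passed"),
    Evidence("docs/README.md", "Install with pip\nRun the server"),
]
packet = build_packet("parse_date ValueError", candidates)
for ev in packet:
    print(f"{ev.path}: {ev.text}")
```

In this toy run the packet holds one line from the source file and one line from the test log, while the README and the surrounding noise never reach the prompt. Each packet item keeps its `path`, which is the sketch's simple way of leaving the full source recoverable.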

How much did the tests save and what did the experiments show?

On SWE-bench Lite, evaluated with OpenClaw and Claude Code across 50 task runs per host-agent condition, ContextSniper reduced total token use by 51.5% for OpenClaw and by 38.9% for Claude Code. Logged cost fell 36.4% for OpenClaw, and estimated cost declined 27.3% for Claude Code; submitted-resolution rates decreased from 26.0% to 24.0% for OpenClaw and from 32.0% to 30.0% for Claude Code.

The authors report these savings alongside only a slight drop in submitted-resolution rates, a fall of two percentage points for each agent. ContextSniper's implementation focuses on high-precision evidence selection, trading a smaller context for a modest change in repair success.
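It helps to translate the percentages back into task counts. The short Python calculation below assumes that each reported rate is computed over the 50 runs of its condition, which is an assumption here, though one the round numbers are consistent with.

```python
RUNS = 50  # task runs per host-agent condition, as reported

# Submitted-resolution rates in percent: (without ContextSniper, with ContextSniper)
rates = {
    "OpenClaw": (26.0, 24.0),
    "Claude Code": (32.0, 30.0),
}


def resolved_count(rate_percent: float, runs: int = RUNS) -> int:
    """Convert a percentage rate into a whole number of resolved tasks."""
    return round(rate_percent / 100 * runs)


for agent, (before, after) in rates.items():
    b, a = resolved_count(before), resolved_count(after)
    print(f"{agent}: {b} -> {a} resolved of {RUNS} ({b - a} fewer)")
```

Under that assumption, the two-point drop corresponds to one fewer resolved task per agent: 13 to 12 for OpenClaw and 16 to 15 for Claude Code. With 50 runs per condition, a difference of one task is small enough that it should be read with caution.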

Why it matters

Reducing token use by tens of percent directly lowers the operational cost of running large language model agents across many repository repair tasks, and it reduces prompt bloat that can hide the actual evidence needed for repair. For teams that run repeated or large-scale repository-level repair workflows, a 51.5% token reduction for one agent (OpenClaw) or a 38.9% reduction for another (Claude Code) can materially cut billable tokens and speed response times.

The modest decline in submitted-resolution rates shows the trade-off: more aggressive filtering saves tokens and cost but can miss context that helps a model resolve an issue. ContextSniper makes that trade explicit by using hybrid retrieval signals and an intention-aware gate, letting operators decide whether token savings justify a slight fall in automated repair success.
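One way to weigh that trade is to ask how many tokens are spent per resolved task. The metric below is an illustrative derivation, not a figure reported in the paper; it assumes total token use and resolution rates are measured over the same set of runs in both conditions.

```python
def tokens_per_resolved_ratio(
    token_reduction_pct: float,
    rate_before_pct: float,
    rate_after_pct: float,
) -> float:
    """Tokens per resolved task with ContextSniper, relative to without it.

    A value below 1.0 means each resolved task costs fewer tokens overall.
    """
    token_ratio = 1 - token_reduction_pct / 100      # remaining share of total tokens
    resolved_ratio = rate_after_pct / rate_before_pct  # remaining share of resolved tasks
    return token_ratio / resolved_ratio


reported = {
    "OpenClaw": (51.5, 26.0, 24.0),
    "Claude Code": (38.9, 32.0, 30.0),
}

for agent, figures in reported.items():
    ratio = tokens_per_resolved_ratio(*figures)
    print(f"{agent}: {(1 - ratio) * 100:.1f}% fewer tokens per resolved task")
```

On the reported figures this works out to roughly 47.5% fewer tokens per resolved task for OpenClaw and 34.8% for Claude Code, so most of the headline saving survives even after accounting for the lower resolution rate.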

What to watch

Check the project's open pilot scripts and follow-up experiments to see whether tuning the retrieval and gating parameters recovers the small drop in resolution rates while keeping most of the token savings. The paper's authors note that ContextSniper's pilot testing scripts are open-sourced.

Authors and provenance: the paper "ContextSniper: AntTrail's Token-Efficient Code Memory for Repository-Level Program Repair" was submitted to arXiv on 2 July 2026 by Chiwang Luk and eight coauthors and evaluated on SWE-bench Lite with 50 task runs per host-agent condition.

ContextSniper experimental results (OpenClaw vs Claude Code)

Item	OpenClaw	Claude Code
Total token use reduction	51.5% reduction	38.9% reduction
Cost reduction	36.4% logged cost reduction	27.3% estimated cost reduction
Submitted-resolution rate (before → after)	26.0% → 24.0%	32.0% → 30.0%
Task runs per host-agent condition	50 runs	50 runs

Written by The Brieftide · Source: arXiv
