
SWE-Router: Routing multi-turn software engineering tasks

SWE-Router lets a cheap model explore for a few turns, then conditions its routing decision on the resulting partial trajectory.

The Brieftide

TL;DR

  • SWE-Router lets a cheap model explore for a few turns, then conditions its routing decision on the resulting partial trajectory.
  • The submission also includes a released multi-LLM trajectory dataset to allow reproduction of trajectory-level routing.
  • SWE-Router shifts decision-making onto the partial trajectory produced during exploration, making routing decisions conditional on observed intermediate outputs rather than only on the initial prompt.

SWE-Router, a routing method for multi-turn agentic software engineering workflows, was submitted to arXiv on 30 Jun 2026 as arXiv:2607.00053 by Seongho Son, Sangwoong Yoon, Jiahua Tang, Shuhan Wang, Lorenz Wolf and Ilija Bogunovic. The paper introduces a value-based temporal router that lets a cheap model run for a few exploratory turns, reads the resulting partial trajectory, and then decides whether to continue using the cheap model or to escalate to a stronger, more expensive model.

The authors pair a formal result with an empirical evaluation. They present a Bayes-optimality theorem showing that "conditioning on the partial trajectory never harms routing" and is strictly better whenever exploration is informative. The empirical work evaluates SWE-Router across pairs of weak and strong LLMs spanning the contemporary cost-capability frontier, finding that the router greatly improves cost efficiency on software engineering tasks while retaining the majority of the stronger model's performance. The submission also includes a released multi-LLM trajectory dataset to allow reproduction of trajectory-level routing.

What is SWE-Router and how does it work?

SWE-Router is a value-based temporal routing scheme that first runs a cheap model for a small number of exploratory turns, observes the partial trajectory those turns generate, and then uses that partial trajectory to decide whether to escalate to a stronger model. In practice the system treats the interaction as a trajectory-level decision problem: it evaluates expected value after short, cheap exploration rather than routing off the static task description alone.
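The explore-then-route loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Trajectory` class, the `cheap`, `strong` and `value_fn` callables, the fixed `threshold` rule, and the choice to let the strong model continue from the partial trajectory rather than restart are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Trajectory:
    """Task description plus the turns produced so far (hypothetical structure)."""
    task: str
    turns: List[str] = field(default_factory=list)


# Hypothetical interfaces: an agent maps a trajectory to its next turn,
# and a value function scores how promising the partial trajectory looks
# if the cheap model were to keep going.
Agent = Callable[[Trajectory], str]
ValueFn = Callable[[Trajectory], float]


def route(
    task: str,
    cheap: Agent,
    strong: Agent,
    value_fn: ValueFn,
    explore_turns: int = 3,
    max_turns: int = 10,
    threshold: float = 0.5,
) -> Tuple[str, Trajectory]:
    traj = Trajectory(task)

    # 1. Let the cheap model explore for a few turns.
    for _ in range(explore_turns):
        traj.turns.append(cheap(traj))

    # 2. Decide from the partial trajectory, not from the task text alone.
    #    The threshold comparison is an assumed decision rule.
    escalate = value_fn(traj) < threshold
    agent = strong if escalate else cheap

    # 3. Finish the task with the selected model (assumed to continue
    #    from the partial trajectory).
    while len(traj.turns) < max_turns:
        traj.turns.append(agent(traj))

    return ("strong" if escalate else "cheap"), traj
```

In this sketch the only place the task description and the exploratory turns meet is `value_fn`, which is where a learned value estimator would sit.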

The paper contrasts this approach with existing routers that route based on the task description only, which the authors argue suffers from an information-theoretic Bayes-error floor in agentic, multi-turn settings. SWE-Router shifts decision-making onto the partial trajectory produced during exploration, making routing decisions conditional on observed intermediate outputs rather than only on the initial prompt.
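The intuition behind the "never harms" claim can be written as a standard inequality. The notation below is assumed for illustration and may differ from the paper's own statement: x is the task description, τ the partial trajectory, a the routing action, and U(a) a utility that trades task success against cost. The sketch also treats the exploratory turns as already paid for.

```latex
% x: task description; \tau: partial trajectory from the cheap model's exploratory turns
% a \in \{\text{cheap}, \text{strong}\}: routing action; U(a): utility (assumed notation)
\[
\mathbb{E}_{\tau \mid x}\!\left[\max_{a}\, \mathbb{E}\left[U(a) \mid x, \tau\right]\right]
\;\ge\;
\max_{a}\, \mathbb{E}_{\tau \mid x}\!\left[\mathbb{E}\left[U(a) \mid x, \tau\right]\right]
\;=\;
\max_{a}\, \mathbb{E}\left[U(a) \mid x\right]
\]
```

The left side is the value of a router that picks the best action after seeing τ; the right side is the best a description-only router can do. The inequality holds because an expected maximum is at least the maximum of expectations, and the final equality is the tower property. Equality holds when the same action is optimal for almost every τ, which is the case where exploration carries no information relevant to the decision.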

How well does SWE-Router perform in experiments?

Across the evaluated pairs of weak and strong LLMs, which span the contemporary cost-capability frontier, SWE-Router notably improves cost efficiency on software engineering tasks while retaining the majority of the stronger model's performance. The paper presents both a theoretical Bayes-optimality claim and empirical comparisons across those LLM pairs to substantiate the cost-performance trade-off.

The authors additionally release a multi-LLM trajectory dataset intended to support reproduction of the trajectory-level routing experiments, enabling other researchers to validate the cost-efficiency gains and routing behavior at the trajectory granularity the method relies on.

Why it matters

Routing based on partial trajectories addresses a concrete failure mode: identical task descriptions can mask very different underlying effort (for example, a trivial typo versus a multi-module refactor). By letting a cheap model perform a few exploratory turns and conditioning routing decisions on the resulting partial trajectory, SWE-Router reduces unnecessary escalation to expensive models while preserving performance when escalation is actually needed. The paper frames this shift both formally, with a Bayes-optimality theorem, and empirically, with cross-model experiments and a released dataset.
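A toy calculation makes the typo-versus-refactor failure mode concrete. Every number here is invented for illustration and none comes from the paper; the example also assumes that exploration reveals the task type perfectly and ignores the cost of the exploratory turns.

```python
# Toy illustration only: all values below are hypothetical.
REWARD = 20.0                            # payoff for a solved task
COST = {"cheap": 1.0, "strong": 10.0}    # per-task cost of each model
P_SUCCESS = {                            # success probability by task type and model
    "typo": {"cheap": 0.95, "strong": 0.97},
    "refactor": {"cheap": 0.20, "strong": 0.80},
}
PRIOR = {"typo": 0.5, "refactor": 0.5}   # both types share one task description


def utility(kind: str, model: str) -> float:
    """Expected payoff minus cost for running `model` on a task of type `kind`."""
    return REWARD * P_SUCCESS[kind][model] - COST[model]


def description_only_value() -> float:
    """Best expected utility when one model must serve every task with this description."""
    return max(sum(PRIOR[k] * utility(k, m) for k in PRIOR) for m in COST)


def trajectory_conditioned_value() -> float:
    """Expected utility when the router can pick the best model per revealed task type."""
    return sum(PRIOR[k] * max(utility(k, m) for m in COST) for k in PRIOR)
```

With these made-up numbers the description-only router settles on the cheap model for an expected utility of 10.5, while the trajectory-conditioned router keeps the cheap model on typos, escalates refactors, and reaches 12.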

What to watch

Look for the released multi-LLM trajectory dataset and for reproductions of the trajectory-level routing experiments; the authors included that dataset with the arXiv submission. Also watch for presentations at the 5th Deep Learning for Code Workshop at ICML 2026, which the submission metadata lists in its comments field.

SWE-Router flow: exploration then conditional escalation
User task / bug report → Cheap model (exploration) → Partial trajectory (exploratory turns) → Router decision (value-based) → Strong model (escalation) → Final output / fix

Written by The Brieftide · Source: arXiv
