SkillMigrator: Transferable Interaction Patterns cut LLM actions
SkillMigrator reuses web skills by matching page layout; it cuts average LLM-action count by 8–10% on WebArena and Mind2Web.
TL;DR
- SkillMigrator reuses web skills by matching page layout; it cuts average LLM-action count by 8–10% on WebArena and Mind2Web.
- The paper contrasts this layout-driven retrieval with prior libraries that trigger skills mainly by instruction similarity or coarse site metadata.
- Those prior approaches produced low skill reuse on held-out sites, the authors write, because they relied on textual similarity or coarse site tags rather than page structure.
SkillMigrator, presented in an arXiv paper submitted 16 Jun 2026 by Shiqi He et al., learns reusable web skills and transfers them across sites by matching page layout structure rather than element-level references. The system stores each induced skill with a structural sketch as a transferable interaction pattern (TIP), retrieves TIPs by layout similarity at test time, and grounds their references on the live page. The authors report that this cuts average LLM-action counts on successful trajectories by 8–10% across WebArena and Mind2Web at matched success rate.
How does SkillMigrator work?
SkillMigrator pairs each induced skill with a structural sketch and stores the pair as a "transferable interaction pattern (TIP)". At test time it retrieves TIPs by layout similarity, then grounds each pattern's references on the live page. After grounding, the agent invokes skills as callable tools on top of standard primitives. Observations are accessibility snapshots with stable references, and the system performs fixed tool calling over primitives plus skill invocations.
The paper contrasts this layout-driven retrieval with prior libraries that trigger skills mainly by instruction similarity or coarse site metadata. Those prior approaches produced low skill reuse on held-out sites, the authors write, because they relied on textual similarity or coarse site tags rather than page structure. SkillMigrator instead matches the page's layout to the sketch captured at induction time and repairs references to make the skill usable on a new page.
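The retrieve-then-ground loop described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the paper's implementation: the source does not say how the structural sketch is encoded or how similarity is scored, so the role-path sketch, the Jaccard score, the 0.5 threshold, and every name here (`Node`, `TIP`, `layout_sketch`, `retrieve`, `ground`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One element of a toy accessibility snapshot (hypothetical format)."""
    role: str
    ref: str
    children: tuple = ()


def layout_sketch(node, prefix=""):
    """Collect the set of role paths in a snapshot, e.g. 'main/form/textbox'."""
    path = f"{prefix}/{node.role}" if prefix else node.role
    paths = {path}
    for child in node.children:
        paths |= layout_sketch(child, path)
    return paths


def jaccard(a, b):
    """Set overlap in [0, 1]; an assumed stand-in for layout similarity."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


@dataclass
class TIP:
    """A skill name, the sketch captured at induction, and slots to re-ground."""
    name: str
    sketch: frozenset
    slots: dict  # slot name -> role path expected on the page


def retrieve(tips, page, threshold=0.5):
    """Return the TIP whose sketch best matches the live page, or None."""
    page_sketch = layout_sketch(page)
    scored = [(jaccard(t.sketch, page_sketch), t) for t in tips]
    scored = [(s, t) for s, t in scored if s >= threshold]
    if not scored:
        return None
    return max(scored, key=lambda st: st[0])[1]


def ground(tip, page):
    """Bind each slot to the ref of the first live node with the wanted role path."""
    found = {}

    def walk(node, prefix=""):
        path = f"{prefix}/{node.role}" if prefix else node.role
        for slot, wanted in tip.slots.items():
            if wanted == path and slot not in found:
                found[slot] = node.ref
        for child in node.children:
            walk(child, path)

    walk(page)
    # Grounding fails unless every slot was bound.
    return found if len(found) == len(tip.slots) else None


# Page seen at induction time, and a structurally similar page on another site.
induction_page = Node("main", "a1", (
    Node("form", "a2", (Node("textbox", "a3"), Node("button", "a4"))),
))
live_page = Node("main", "b1", (
    Node("form", "b7", (Node("textbox", "b8"), Node("button", "b9"))),
    Node("link", "b10"),
))

search_tip = TIP(
    name="search",
    sketch=frozenset(layout_sketch(induction_page)),
    slots={"query": "main/form/textbox", "submit": "main/form/button"},
)

match = retrieve([search_tip], live_page)
bindings = ground(match, live_page)
print(match.name, bindings)
```

In this toy the two pages share four of five role paths, so the TIP clears the threshold even though every element reference differs. Grounding then rebinds the skill's slots to the live page's references, which is the repair step the paragraph above describes.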
How much improvement does it deliver?
Compared with state-of-the-art approaches, SkillMigrator reduces the average LLM-action count on successful trajectories by 8–10% across both WebArena and Mind2Web at matched success rate. The paper sets this against a known cost problem: when every action is a low-level primitive, horizons grow quickly and LLM completions dominate latency and cost on benchmarks such as Mind2Web and WebArena.
The authors argue that grouping repeated interaction fragments into callable skills can replace several primitives with a single call, and that SkillMigrator's layout-based retrieval captures more of those repeated fragments across sites than previous trigger methods did. The 8–10% figure counts LLM actions only on successful trajectories and holds success rate matched, so it measures shorter runs rather than more solved tasks.
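To make the metric concrete, here is a small sketch of how folding a repeated fragment into one skill call lowers the LLM-action count. The trajectory, action names, and resulting percentage are invented for illustration and are not results from the paper; `compress` and `relative_reduction` are hypothetical helpers.

```python
def compress(trajectory, fragment, skill_name):
    """Replace each non-overlapping occurrence of `fragment` with one skill call."""
    if not fragment:
        raise ValueError("fragment must contain at least one action")
    out, i, n = [], 0, len(fragment)
    while i < len(trajectory):
        if trajectory[i:i + n] == fragment:
            out.append(skill_name)  # one LLM action instead of n
            i += n
        else:
            out.append(trajectory[i])
            i += 1
    return out


def relative_reduction(before, after):
    """Fractional drop in LLM-action count between two trajectories."""
    return (len(before) - len(after)) / len(before)


# Toy successful trajectory made only of low-level primitives.
primitives = [
    "click_search", "type_query", "press_enter",
    "click_result", "scroll", "click_add",
    "click_cart", "click_checkout", "type_address", "click_submit",
]
search_fragment = ["click_search", "type_query", "press_enter"]

compressed = compress(primitives, search_fragment, "search_skill")
print(len(primitives), len(compressed), relative_reduction(primitives, compressed))
```

Each element of a trajectory stands for one LLM-emitted action, so a skill that covers three primitives saves two model calls every time it fires. The paper's average is taken over successful trajectories only.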
Why it matters
Low-level primitives force agents to emit long, policy-facing LLM completions and, the paper notes, cause latency and token costs to balloon on benchmarks like Mind2Web and WebArena. By learning skills that transfer across sites via layout similarity, SkillMigrator addresses the core reuse failure that prior libraries left unresolved: low reuse on held-out sites. Cutting 8–10% of LLM actions on successful runs directly targets the number of model calls and tokens generated, which are the primary drivers of latency and cost in tool-calling web agents.
What to watch
See whether TIP retrieval increases measurable skill reuse on held-out sites beyond the two tested benchmarks and whether the approach yields corresponding drops in token counts and end-to-end latency in full-system runs. Also watch for follow-up work or code and data releases from Shiqi He et al. that allow independent reproduction of the 8–10% action-count reduction.
Written by The Brieftide · Source: arXiv