Coding Agents · July 2, 2026 · 4 min read

SkillSelect-Serve: Budget-Controlled, QoS-Aware Skill Service Recommendation

SkillSelect-Serve represents agent skills as structured services and optimizes bundles of them; in experiments on 35,353 skills and 586 queries, it improves same-budget bundle recall.

The Brieftide · July 2, 2026

TL;DR

01 SkillSelect-Serve represents agent skills as structured services and optimizes bundles of them; in experiments on 35,353 skills and 586 queries, it improves same-budget bundle recall.
02 SkillSelect-Serve, by Jingyuan Zheng and six coauthors, frames agent skill selection as Skill Service Recommendation and Composition and targets small LLM agents.
03 The authors position the approach for small LLM agents and cast selection and composition as budget-controllable and QoS-aware decisions, replacing the common retrievable-document view of skills.

SkillSelect-Serve, by Jingyuan Zheng and six coauthors, frames agent skill selection as Skill Service Recommendation and Composition and targets small LLM agents. The paper was submitted to arXiv on 8 May 2026 (arXiv:2607.00011) and reports experiments over 35,353 skills and 586 task queries in which the framework improves same-budget bundle recall and mean utility over fixed top-k retrieval baselines.

What is SkillSelect-Serve?

SkillSelect-Serve is a framework that models reusable agent skills as structured Skill Services, each with functional descriptions, dependencies, context cost, risk, and QoS-related attributes. It treats selection as a bundle recommendation problem rather than a fixed top-k retrieval task. The authors position the approach for small LLM agents and cast selection and composition as budget-controllable, QoS-aware decisions, in place of the common view of skills as retrievable documents.

The paper defines Skill Services as richer metadata objects and frames selection as an economic-style trade-off among coverage, redundancy, cost, and risk. That reframing underpins the experiments and the evaluation the authors present.
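To make the "richer metadata object" idea concrete, here is a minimal sketch of what a Skill Service record could look like. The attribute categories come from the description above; the field names, types, units, and the two helper functions are assumptions made for illustration, not the paper's schema.

```python
from dataclasses import dataclass, field


@dataclass
class SkillService:
    """Hypothetical record for one skill; field names and units are assumptions."""
    name: str
    description: str                           # functional description
    dependencies: tuple = ()                   # names of skills that must also be loaded
    context_cost: int = 0                      # assumed unit: prompt tokens
    risk: float = 0.0                          # assumed scale: 0 (safe) to 1 (risky)
    qos: dict = field(default_factory=dict)    # assumed keys, e.g. latency or reliability


def bundle_cost(bundle):
    """Total context cost of a bundle of SkillService records."""
    return sum(s.context_cost for s in bundle)


def missing_dependencies(bundle):
    """Names of dependencies that are required but not present in the bundle."""
    names = {s.name for s in bundle}
    needed = {dep for s in bundle for dep in s.dependencies}
    return sorted(needed - names)


# Illustrative data only; these skills are invented for the example.
reader = SkillService("csv_reader", "Load CSV files", context_cost=200, risk=0.05)
plotter = SkillService("plotter", "Plot tabular data",
                       dependencies=("csv_reader",), context_cost=300, risk=0.1)
```

With records like these, a selector can check a budget with `bundle_cost([reader, plotter])` and detect an incomplete bundle with `missing_dependencies([plotter])` before handing skills to the agent.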

How does it work?

SkillSelect-Serve converts natural-language tasks into structured requirements with a local Micro-Agent Requirement Planner, retrieves candidate services from a shared discovery backbone, and applies dual-granularity utility modeling to pick bundles that meet budget and QoS constraints. The pipeline explicitly models skill-level marginal suitability and then calibrates utility at the bundle level to balance coverage, redundancy, cost, and risk.

In practice, the pipeline runs in four steps. Raw skills are first represented as structured Skill Services carrying functional descriptions, dependencies, context cost, risk, and QoS-related attributes. A Micro-Agent Requirement Planner then translates a user task into structured service requirements, and a shared discovery backbone finds candidate services in a large registry. Finally, the selection stage estimates marginal suitability for individual services and adjusts bundle-level utilities to account for overlaps and budget constraints. The submission documents implementation and evaluation details across five figures and six tables.
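The selection stage can be illustrated with a small greedy routine. This is a sketch under stated assumptions, not the authors' algorithm: it swaps in a simple hand-written utility (coverage minus redundancy and risk penalties) and a gain-per-cost greedy rule. The `Candidate` class, the weights, and the sample skills are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    covers: frozenset    # requirement tags this skill satisfies (assumed representation)
    context_cost: int    # assumed unit: tokens
    risk: float          # assumed scale: 0 to 1


def bundle_utility(bundle, requirements, redundancy_weight=0.5, risk_weight=1.0):
    """Illustrative bundle-level utility: coverage minus redundancy and risk penalties."""
    covered, overlap = set(), 0
    for c in bundle:
        useful = c.covers & requirements
        overlap += len(useful & covered)   # tags already covered by earlier picks
        covered |= useful
    total_risk = sum(c.risk for c in bundle)
    return len(covered) - redundancy_weight * overlap - risk_weight * total_risk


def select_bundle(candidates, requirements, budget, **weights):
    """Greedily add the affordable candidate with the best marginal utility per unit cost."""
    bundle, spent, remaining = [], 0, list(candidates)
    while remaining:
        base = bundle_utility(bundle, requirements, **weights)
        best, best_ratio = None, 0.0
        for c in remaining:
            if spent + c.context_cost > budget:
                continue  # would exceed the context budget
            gain = bundle_utility(bundle + [c], requirements, **weights) - base
            ratio = gain / max(c.context_cost, 1)
            if ratio > best_ratio:
                best, best_ratio = c, ratio
        if best is None:
            break  # nothing affordable improves the bundle
        bundle.append(best)
        spent += best.context_cost
        remaining.remove(best)
    return bundle


# Illustrative data only.
requirements = frozenset({"parse_csv", "plot", "summarize"})
candidates = [
    Candidate("csv_reader", frozenset({"parse_csv"}), 200, 0.05),
    Candidate("csv_toolkit", frozenset({"parse_csv", "summarize"}), 500, 0.10),
    Candidate("plotter", frozenset({"plot"}), 300, 0.10),
    Candidate("all_in_one", frozenset({"parse_csv", "plot", "summarize"}), 1500, 0.30),
]
bundle = select_bundle(candidates, requirements, budget=1000)
print([c.name for c in bundle])
```

In this toy run the routine skips the expensive `all_in_one` skill because it exceeds the budget, and it adds `csv_toolkit` last because part of its coverage overlaps with `csv_reader`. That overlap penalty is the kind of bundle-level adjustment a per-skill top-k ranking cannot express.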

What did the experiments show?

Experiments on a collection of 35,353 skills and 586 task queries demonstrate that SkillSelect-Serve consistently improves same-budget bundle recall and mean utility compared with fixed top-k retrieval baselines. The paper reports these gains as the primary experimental result and uses the term "same-budget" to emphasize the controlled-cost comparisons against top-k lists.

The evaluation setup, as described in the submission, treats skill selection as a composition problem and measures both recall for bundles and a mean utility metric, with improvements observed across the tested queries and skill set.
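A same-budget comparison can be sketched as follows. The brief does not give the paper's exact metric definitions, so this assumes bundle recall is the fraction of a query's relevant skills that appear in the selected set, and that the top-k baseline is cut off once it would exceed the same context budget. All names and numbers are illustrative.

```python
def bundle_recall(selected, gold):
    """Fraction of gold (relevant) skills that appear in the selected bundle."""
    gold = set(gold)
    if not gold:
        return 0.0
    return len(gold & set(selected)) / len(gold)


def top_k_under_budget(ranked, costs, k, budget):
    """Fixed top-k baseline, cut off once the next skill would exceed the budget."""
    picked, spent = [], 0
    for name in ranked[:k]:
        if spent + costs[name] > budget:
            break
        picked.append(name)
        spent += costs[name]
    return picked


# Illustrative data only: per-skill context costs, a similarity ranking, and gold labels.
costs = {"a": 400, "b": 400, "c": 200, "d": 100}
ranked = ["a", "b", "c", "d"]
gold = {"a", "c", "d"}
budget = 700

baseline = top_k_under_budget(ranked, costs, k=3, budget=budget)
bundle = ["a", "c", "d"]   # a hand-picked bundle that also costs 700

print(bundle_recall(baseline, gold), bundle_recall(bundle, gold))
```

Both selections here spend at most 700 cost units, so any difference in recall comes from which skills were chosen, not from how much context was spent. Averaging such per-query scores over a query set gives a single same-budget figure per method.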

Why it matters

SkillSelect-Serve shifts skill selection from document retrieval to service-aware bundle optimization, which matters because agent workflows increasingly compose many small, reusable capabilities. Modeling context cost, risk, dependencies, and QoS directly gives agents levers to respect budget and operational constraints that fixed top-k lists cannot express. For developers of small LLM agents, that makes it easier to control expense and redundancy while targeting functional coverage.

The approach also offers a practical path for large registries: treating skills as structured services creates metadata hooks that discovery backbones and requirement planners can exploit, so selection can weigh cost against end-to-end utility instead of relying on raw similarity scores alone.

What to watch

Check for broader evaluations beyond the paper's 35,353-skill, 586-query experiments and for public code or datasets linked from the submission's code and data sections. Follow-up work should confirm whether bundle-level calibration consistently outperforms fixed top-k retrieval across different registries and real-world agent tasks.

References and submission details: "SkillSelect-Serve: Budget-Controllable and QoS-Aware Skill Service Recommendation and Composition for Small LLM Agents," Jingyuan Zheng et al., arXiv:2607.00011, submitted 8 May 2026. DOI: https://doi.org/10.48550/arXiv.2607.00011.

Figure: SkillSelect-Serve system components and flow

Written by The Brieftide · Source: arXiv
