SkillSelect-Serve: Budget-Controlled, QoS-Aware Skill
Represents skills as structured services and optimizes bundles; experiments on 35,353 skills and 586 queries improve same-budget recall.
TL;DR
- Represents skills as structured services and optimizes bundles; experiments on 35,353 skills and 586 queries improve same-budget recall.
- SkillSelect-Serve, by Jingyuan Zheng and six coauthors, frames agent skill selection as Skill Service Recommendation and Composition and targets small LLM agents.
- The authors position the approach for small LLM agents and cast selection and composition as budget-controllable and QoS-aware decisions, replacing the common retrievable-document view of skills.
SkillSelect-Serve, by Jingyuan Zheng and six coauthors, frames agent skill selection as Skill Service Recommendation and Composition and targets small LLM agents. The paper was submitted to arXiv on 8 May 2026 (arXiv:2607.00011) and reports experiments over 35,353 skills and 586 task queries that improve same-budget bundle recall and mean utility versus fixed top-k retrieval baselines.
What is SkillSelect-Serve?
SkillSelect-Serve is a framework that models reusable agent skills as structured Skill Services with functional descriptions, dependencies, context cost, risk, and QoS-related attributes, and it treats selection as a bundle recommendation problem rather than a fixed top-k retrieval task. The authors position the approach for small LLM agents and cast selection and composition as budget-controllable and QoS-aware decisions, replacing the common retrievable-document view of skills.
The paper defines Skill Services as richer metadata objects and places selection inside an economic-style trade-off: coverage, redundancy, cost, and risk. That reframing underpins the experiments and the evaluation the authors present.
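To make the "richer metadata object" idea concrete, here is a minimal sketch of what a Skill Service record might look like. The attribute categories (functional description, dependencies, context cost, risk, QoS) come from the summary above; every field name, unit, and scale below, plus the `capabilities` tags and the two helper functions, are assumptions made for illustration and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SkillService:
    """Hypothetical Skill Service record; field names are illustrative only."""
    name: str
    description: str                              # functional description
    capabilities: frozenset[str]                  # requirement tags covered (assumed field)
    dependencies: frozenset[str] = frozenset()    # names of skills this one needs
    context_cost: int = 0                         # assumed unit: prompt tokens
    risk: float = 0.0                             # assumed scale: 0.0 (low) to 1.0 (high)
    qos: dict[str, float] = field(default_factory=dict)  # e.g. latency, reliability


def bundle_cost(bundle: list[SkillService]) -> int:
    """Total context cost of loading every skill in the bundle."""
    return sum(s.context_cost for s in bundle)


def missing_dependencies(bundle: list[SkillService]) -> set[str]:
    """Dependency names that are required by the bundle but not part of it."""
    present = {s.name for s in bundle}
    needed: set[str] = set()
    for s in bundle:
        needed |= s.dependencies
    return needed - present


# Toy data, invented for this example.
pdf_reader = SkillService(
    name="pdf_reader",
    description="Extract text from PDF files",
    capabilities=frozenset({"read_pdf"}),
    context_cost=400,
    risk=0.1,
    qos={"latency_s": 1.5},
)
summarizer = SkillService(
    name="summarizer",
    description="Summarize long text",
    capabilities=frozenset({"summarize"}),
    dependencies=frozenset({"pdf_reader", "tokenizer"}),
    context_cost=250,
    risk=0.05,
)

print(bundle_cost([pdf_reader, summarizer]))
print(missing_dependencies([pdf_reader, summarizer]))
```

A plain retrievable document carries none of these fields, which is why a ranked list alone cannot say whether a bundle fits a budget or leaves a dependency unmet.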
How does it work?
SkillSelect-Serve converts natural-language tasks into structured requirements with a local Micro-Agent Requirement Planner, retrieves candidate services from a shared discovery backbone, and applies dual-granularity utility modeling to pick bundles that meet budget and QoS constraints. The pipeline explicitly models skill-level marginal suitability and then calibrates utility at the bundle level to balance coverage, redundancy, cost, and risk.
In practice, the pipeline runs in four steps. Raw skills are first wrapped as structured Skill Services that carry the attributes listed above. A Micro-Agent Requirement Planner then translates a user task into structured service requirements, and a shared discovery backbone finds candidate services in a large registry. Finally, the selection stage estimates marginal suitability for individual services and adjusts bundle-level utilities to control for overlaps and budget constraints. The authors include implementation and evaluation details across five figures and six tables in the submission.
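The selection stage can be illustrated with a simple greedy heuristic: repeatedly add the affordable candidate with the best marginal gain per unit of cost, penalizing overlap and risk. This is a generic budgeted-coverage sketch, not the paper's dual-granularity utility model, which the summary does not specify. The `Candidate` type, the weights, and the toy candidates are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate skill returned by a discovery step."""
    name: str
    covers: frozenset   # requirement tags this skill satisfies (assumed)
    cost: int           # context cost, assumed to be in tokens
    risk: float         # assumed scale: 0.0 to 1.0


def marginal_gain(c, required, uncovered, w_redundancy=0.2, w_risk=0.5):
    """Skill-level score: new coverage minus overlap and risk penalties.

    The weights are arbitrary illustrative values.
    """
    newly_covered = len(c.covers & uncovered)
    redundant = len((c.covers & required) - uncovered)
    return newly_covered - w_redundancy * redundant - w_risk * c.risk


def select_bundle(candidates, required, budget):
    """Greedily build a bundle that never exceeds the context budget."""
    required = set(required)
    uncovered = set(required)
    remaining = budget
    pool = list(candidates)
    bundle = []
    while uncovered:
        affordable = [c for c in pool if c.cost <= remaining]
        if not affordable:
            break
        # Rank by gain per unit cost so cheap, useful skills come first.
        best = max(
            affordable,
            key=lambda c: marginal_gain(c, required, uncovered) / max(c.cost, 1),
        )
        if marginal_gain(best, required, uncovered) <= 0:
            break
        bundle.append(best)
        uncovered -= best.covers
        remaining -= best.cost
        pool.remove(best)
    return bundle, uncovered


# Toy data, invented for this example.
candidates = [
    Candidate("pdf_reader", frozenset({"read_pdf"}), 400, 0.1),
    Candidate("summarizer", frozenset({"summarize"}), 250, 0.05),
    Candidate("doc_suite", frozenset({"read_pdf", "summarize"}), 900, 0.2),
    Candidate("translator", frozenset({"translate"}), 300, 0.1),
]
required = {"read_pdf", "summarize", "translate"}

bundle, uncovered = select_bundle(candidates, required, budget=1000)
print([c.name for c in bundle])  # ['summarizer', 'translator', 'pdf_reader']
print(sum(c.cost for c in bundle), uncovered)
```

In this toy case the broad but expensive `doc_suite` is never chosen: three cheaper skills cover every requirement for 950 of the 1,000 available tokens, which is the kind of trade-off a fixed top-k list has no way to express.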
What did the experiments show?
Experiments on a collection of 35,353 skills and 586 task queries demonstrate that SkillSelect-Serve consistently improves same-budget bundle recall and mean utility compared with fixed top-k retrieval baselines. The paper reports these gains as the primary experimental result and uses the term "same-budget" to emphasize the controlled-cost comparisons against top-k lists.
The evaluation setup, as described in the submission, treats skill selection as a composition problem and measures both recall for bundles and a mean utility metric, with improvements observed across the tested queries and skill set.
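A same-budget comparison can be sketched as follows. The summary does not give the paper's exact metric definitions, so this example assumes that bundle recall is the fraction of a query's gold skills present in the selected bundle, and that the top-k baseline skips any ranked skill that would exceed the budget. All names and numbers are invented for illustration and are not results from the paper.

```python
def bundle_recall(selected, gold):
    """Fraction of gold skills that appear in the selected bundle (assumed definition)."""
    gold = set(gold)
    if not gold:
        return 0.0
    return len(gold & set(selected)) / len(gold)


def top_k_within_budget(ranked, costs, k, budget):
    """Fixed top-k baseline under a budget: skip skills that do not fit (assumed rule)."""
    chosen, spent = [], 0
    for name in ranked[:k]:
        if spent + costs[name] > budget:
            continue
        chosen.append(name)
        spent += costs[name]
    return chosen


# Toy data, invented for this example.
costs = {"summarizer": 250, "doc_suite": 900, "translator": 300, "pdf_reader": 400}
ranked = ["summarizer", "doc_suite", "translator", "pdf_reader"]  # similarity order
gold = {"pdf_reader", "summarizer", "translator"}
budget = 1000

baseline = top_k_within_budget(ranked, costs, k=3, budget=budget)
optimized = ["summarizer", "translator", "pdf_reader"]  # a bundle costing 950

print(baseline, bundle_recall(baseline, gold))
print(optimized, bundle_recall(optimized, gold))
```

Both selections stay within the same 1,000-token budget, so any difference in recall reflects which skills were chosen and not how much context was spent.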
Why it matters
SkillSelect-Serve shifts skill selection from document retrieval to service-aware bundle optimization, which matters because agent workflows increasingly compose many small, reusable capabilities. Modeling context cost, risk, dependencies, and QoS directly gives agents levers to respect budget and operational constraints that fixed top-k lists cannot express. For developers of small LLM agents, that makes it easier to control expense and redundancy while targeting functional coverage.
The approach also offers a practical path for large registries: treating skills as structured services creates metadata hooks that discovery backbones and requirement planners can exploit, enabling trade-offs between cost and end-to-end utility rather than raw similarity scores.
What to watch
Watch for broader evaluations beyond the paper's 35,353-skill, 586-query experiments, and for public code or datasets linked from the submission's code and data sections. Follow-up work should confirm whether bundle-level calibration consistently outperforms fixed top-k retrieval across different registries and real-world agent tasks.
References and submission details: "SkillSelect-Serve: Budget-Controllable and QoS-Aware Skill Service Recommendation and Composition for Small LLM Agents," Jingyuan Zheng et al., arXiv:2607.00011, submitted 8 May 2026. DOI: https://doi.org/10.48550/arXiv.2607.00011.
Written by The Brieftide · Source: arXiv