Janus playground for agentic permission management, arXiv 2026
Janus provides Janus-Core and Janus-Harness for user-involved permission designs; code and data are publicly available.
TL;DR
- Janus provides Janus-Core and Janus-Harness for user-involved permission designs; code and data are publicly available.
- Janus is a new playground for designing and evaluating user-involved permission management for AI agents, published on arXiv as arXiv:2607.01510 and submitted 1 Jul 2026.
- The authors (Natalie Grace Brigham, Eugene Bagdasarian, Tadayoshi Kohno, and Franziska Roesner) release Janus as two components, Janus-Core and Janus-Harness, with code and data available on GitHub.
Janus is a new playground for designing and evaluating user-involved permission management for AI agents, published on arXiv as arXiv:2607.01510 and submitted 1 Jul 2026. The authors — Natalie Grace Brigham, Eugene Bagdasarian, Tadayoshi Kohno, and Franziska Roesner — release Janus as two components, Janus-Core and Janus-Harness, with code and data available on GitHub.
What is Janus and how does it work?
Janus pairs a modular system with an automated evaluation framework. Janus-Core implements agentic permission management and, according to the paper, supports a diverse spectrum of designs. Janus-Harness runs automated evaluations that exercise those designs across scenarios and synthetic responders.
Janus operationalizes a conceptual model that the authors use to identify key design axes for user involvement. The implementation includes six permission assistants that span the design space; the Harness evaluates those assistants across three scenarios and three synthetic responders. The paper’s repository hosts the code and data used in the experiments.
What did the authors evaluate and find?
The authors implemented six permission assistants and evaluated them across three scenarios and three synthetic responders, demonstrating trade-offs rather than a single best choice. Their experiments show that "user input is critical and can significantly strengthen privacy and security," that augmenting user decisions with AI can reduce cognitive load, and that realistic user behavior, including permission fatigue, must be accounted for in system design.
The evaluation follows the conceptual model: the six assistants represent points on its design axes, and Janus-Harness automates their interactions with the synthetic responders so that behavior can be compared. The primary empirical takeaways are comparative and qualitative: no single permission assistant performs best in every tested context, and both user involvement and AI augmentation change outcomes in meaningful ways.
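To make the scale of that comparison concrete, here is a minimal sketch of how an evaluation grid over assistants, scenarios, and responders could be enumerated. It is not code from Janus-Harness: every name in it (ASSISTANTS, Trial, build_grid, run_grid) is a hypothetical placeholder, and it assumes the grid is fully crossed, which this summary of the paper does not confirm.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical placeholder labels; the real assistant, scenario, and
# responder names from the paper are not reproduced here.
ASSISTANTS = [f"assistant_{i}" for i in range(1, 7)]   # six permission assistants
SCENARIOS = [f"scenario_{i}" for i in range(1, 4)]     # three scenarios
RESPONDERS = [f"responder_{i}" for i in range(1, 4)]   # three synthetic responders


@dataclass(frozen=True)
class Trial:
    """One cell of the evaluation grid."""
    assistant: str
    scenario: str
    responder: str


def build_grid(assistants, scenarios, responders):
    """Return one Trial per (assistant, scenario, responder) combination."""
    return [Trial(a, s, r) for a, s, r in product(assistants, scenarios, responders)]


def run_grid(trials, run_trial):
    """Apply a caller-supplied trial function to every cell, keyed by Trial."""
    return {trial: run_trial(trial) for trial in trials}


grid = build_grid(ASSISTANTS, SCENARIOS, RESPONDERS)
print(len(grid))  # 6 * 3 * 3 = 54 cells under the fully crossed assumption
```

Under that assumption, each assistant would face nine scenario and responder pairings, which is the kind of repeated, automated comparison the Harness is described as providing.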
Why does this matter?
Janus addresses a growing gap as AI agents gain the ability to execute tool calls autonomously on users’ behalf: it makes the user's role explicit and testable. By providing both a modular implementation and an automated harness, Janus lets designers compare permission strategies under controlled, repeatable conditions. The project highlights that permission fatigue and realistic user behavior can materially affect privacy and security outcomes, which matters for any deployment that delegates actions to agents.
The paper’s release of code and data on GitHub supports reproducibility and follow-up work, enabling researchers and system designers to test alternative assistants or scenarios without rebuilding the evaluation pipeline from scratch.
What to watch
Watch for follow-up studies that reuse Janus to test additional permission-assistant designs or that replace the paper’s synthetic responders with human subjects or richer simulators. Also watch whether deployments of agentic systems adopt context-sensitive permission assistants, since the paper finds no single design succeeds across all scenarios.
Methods note
The submission is arXiv:2607.01510, submitted 1 Jul 2026. The authors present Janus as two components (Janus-Core and Janus-Harness), implement six permission assistants, and evaluate them across three scenarios and three synthetic responders. The paper is posted on arXiv, and its code and data are publicly available on GitHub.
Written by The Brieftide · Source: arXiv