
Governing Actions, Not Agents: Institutional Attestation Model

Jakob Salfeld-Nebgen formalises a governance model in which agents plan freely, but executing a high-risk act requires independently attested preconditions.

The Brieftide

TL;DR

  • Jakob Salfeld-Nebgen formalises a governance model in which agents plan freely, but executing a high-risk act requires independently attested preconditions.
  • The paper shows an agent can retain autonomy over planning and reasoning while losing execution authority for consequential acts such as clinical prescribing or production software deployment.
  • The paper includes a proof-of-concept implementation and concrete examples from software deployment and clinical prescribing to illustrate the pattern.

Governing Actions, Not Agents: Institutional Attestation as a Governance Model for Autonomous AI Systems, submitted to arXiv on 24 June 2026 by Jakob Salfeld-Nebgen (arXiv:2606.26298), formalises a way to stop autonomous agents from executing designated high-risk actions without independent checks. The paper shows an agent can retain autonomy over planning and reasoning while losing execution authority for consequential acts such as clinical prescribing or production software deployment.

What is institutional attestation?

Institutional attestation is a governance pattern in which execution of a designated high-risk action requires independently issued evidence at the point of action, rather than inspection of the agent's internal reasoning. The paper frames this as an institutional practice adapted to computational systems: each precondition for execution must be attested by a separate authoritative source, the attestations are cryptographically bound to a declared intent, and a deterministic policy evaluates them before any action is performed.
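To make "cryptographically bound to a declared intent" concrete, here is a minimal Python sketch. It is not the paper's implementation: the function names, the field names, and the use of an HMAC with a per-source shared key are all assumptions made for illustration. A real deployment would more plausibly use asymmetric signatures so that verifiers cannot forge attestations.

```python
import hashlib
import hmac
import json


def intent_digest(intent: dict) -> str:
    """Hash a canonical JSON encoding so that equal intents give equal digests."""
    canonical = json.dumps(intent, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def issue_attestation(source: str, key: bytes, precondition: str, intent: dict) -> dict:
    """An authoritative source attests one precondition for one specific intent."""
    digest = intent_digest(intent)
    message = f"{source}|{precondition}|{digest}".encode("utf-8")
    # HMAC stands in for a signature here (illustrative assumption).
    tag = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {"source": source, "precondition": precondition,
            "intent_digest": digest, "tag": tag}


def verify_attestation(att: dict, key: bytes, intent: dict) -> bool:
    """Accept the attestation only if it was issued for exactly this intent."""
    digest = intent_digest(intent)
    if att["intent_digest"] != digest:
        return False
    message = f"{att['source']}|{att['precondition']}|{digest}".encode("utf-8")
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(att["tag"], expected)
```

Because the tag covers the intent digest, an attestation issued for one declared intent cannot be replayed to authorise a different one, which is the property the binding is meant to provide.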

How does the governance model work in practice?

The model separates planning from execution: the agent plans and reasons freely but holds no execution authority over specified risky acts; execution is conditional on independently attested preconditions. Salfeld-Nebgen describes a workflow where an agent declares intent, authoritative sources each provide an attestation that is cryptographically bound to that intent, a deterministic policy evaluates those attestations, and decisions are recorded in a tamper-evident log that supports independent re-verification. The paper includes a proof-of-concept implementation and concrete examples from software deployment and clinical prescribing to illustrate the pattern.
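One common way to get a tamper-evident log that supports independent re-verification is a hash chain, sketched below in Python. The brief does not say which construction the paper's proof-of-concept uses, so the hash chain, the `GENESIS` constant, and the record fields are assumptions for illustration only.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the chain


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, record: dict) -> None:
    """Append a decision record, chaining it to the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(log: list) -> bool:
    """Recompute every hash from the start; any edited entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Anyone holding a copy of the log can run `verify` without trusting the party that wrote it, since changing a recorded decision after the fact changes its hash and invalidates every later entry.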

Which components are required by the paper’s design?

The design requires five core elements: an autonomous agent that performs planning and reasoning, a list of designated high-risk actions for which the agent has no execution authority, separate authoritative attestation sources for each precondition, cryptographic binding of attestations to a declared intent, and a deterministic policy plus a tamper-evident log for recording outcomes. The author explicitly emphasises that the attestations must come from distinct authoritative sources and that the log must be amenable to independent re-verification.
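The deterministic policy gate can be sketched as a pure function over the attestations it receives. The Python below is an illustrative sketch under stated assumptions: the `POLICY` table, the action and source names, and the fail-closed handling of unlisted actions are hypothetical and not taken from the paper. Signature checking is assumed to have happened upstream and is represented by the `valid` flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    source: str        # who issued it
    precondition: str  # what it attests
    intent_id: str     # the declared intent it is bound to
    valid: bool        # outcome of cryptographic verification (assumed upstream)


# Hypothetical policy: each precondition names the one source allowed to attest it.
POLICY = {
    "deploy_to_production": {
        "tests_passed": "ci_system",
        "change_approved": "release_manager",
    },
}


def decide(action: str, intent_id: str, attestations: list) -> bool:
    """Allow execution only if every precondition is attested by its own source."""
    required = POLICY.get(action)
    if required is None:
        return False  # fail closed for unlisted actions (an assumption of this sketch)
    satisfied = {
        a.precondition
        for a in attestations
        if a.valid and a.intent_id == intent_id
        and required.get(a.precondition) == a.source
    }
    return satisfied == set(required)
```

The function has no hidden state, randomness, or model calls, so the same attestations always yield the same decision, which is what allows a third party to re-check a logged outcome later.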

Why it matters

The model shifts oversight from opaque post hoc audits of agent reasoning to point-of-action verification that mirrors long-standing institutional practices. That change matters because it lets systems preserve autonomous planning while making irreversible or consequential acts conditional on verifiable external checks, a structure the paper argues is better suited to high-risk domains like clinical prescribing and production deployments.

What to watch

Watch for implementations that pair this model with production systems for software deployment or clinical prescribing, the two domains the paper uses as examples. The paper’s proof-of-concept offers a concrete starting point for evaluating whether cryptographic bindings, separate authoritative attestations, and deterministic policy gates can be integrated into existing operational pipelines without introducing new single points of failure.

Additional details: the submission appears on arXiv as arXiv:2606.26298 (version v1) and carries the DOI https://doi.org/10.48550/arXiv.2606.26298. The paper is filed under the subjects Artificial Intelligence (cs.AI) and Cryptography and Security (cs.CR).

Figure: Institutional attestation: components and flow
Components shown: autonomous agent (planning and reasoning); declared intent; authoritative attestations (separate sources); cryptographic binding to declared intent; deterministic policy evaluator; execution authority (only granted after checks); tamper-evident log (independent re-verification); designated high-risk actions (no direct execution).

Written by The Brieftide · Source: arXiv
