Multimodal AI · 4 min read

ProvenanceGuard paper: source-aware verifier, block F1 0.802

ProvenanceGuard verifies claim-to-source attribution for MCP-based LLM agents.

The Brieftide

TL;DR

  • ProvenanceGuard verifies claim-to-source attribution for MCP-based LLM agents.
  • ProvenanceGuard, introduced on arXiv on 16 Jun 2026 by Ander Alvarez, Santhiya Rajan, Samuel Mugel and Román Orús, is a source-aware factuality verifier for agents that use the Model Context Protocol.
  • The system consumes MCP traces with stable tool IDs and source IDs, decomposes answers into atomic claims, and returns per-claim verdicts plus an answer-level allow or block decision.

ProvenanceGuard, introduced on arXiv on 16 Jun 2026 by Ander Alvarez, Santhiya Rajan, Samuel Mugel and Román Orús, is a source-aware factuality verifier for agents that use the Model Context Protocol. The system consumes MCP traces with stable tool IDs and source IDs, decomposes answers into atomic claims, and returns per-claim verdicts plus an answer-level allow or block decision.
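To make the inputs and outputs described above concrete, here is a minimal sketch of the records involved. Every class and field name (`ToolCall`, `ClaimVerdict`, `AnswerVerdict`, `stated_source`, `routed_source`) is hypothetical and not taken from the paper, and the rule that a single blocked claim blocks the whole answer is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ToolCall:
    """One captured MCP tool call (hypothetical shape, not the paper's schema)."""
    tool_id: str     # stable tool ID
    source_id: str   # stable source ID
    raw_output: str  # raw tool output kept as evidence


@dataclass
class ClaimVerdict:
    """Verdict for one atomic claim."""
    claim: str
    stated_source: Optional[str]  # source the answer attributes the claim to
    routed_source: Optional[str]  # source the verifier found to support it
    supported: bool

    @property
    def blocked(self) -> bool:
        # Assumption: block when unsupported, or when a stated attribution
        # disagrees with the routed source.
        mismatch = (
            self.stated_source is not None
            and self.stated_source != self.routed_source
        )
        return (not self.supported) or mismatch


@dataclass
class AnswerVerdict:
    """Answer-level decision derived from the per-claim verdicts."""
    claims: List[ClaimVerdict] = field(default_factory=list)

    @property
    def decision(self) -> str:
        # Assumption: any blocked claim blocks the whole answer.
        return "block" if any(c.blocked for c in self.claims) else "allow"
```

With these types, a claim that is supported but attributed to the wrong source is still blocked, which is the cross-source conflation case the paper targets.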

How does ProvenanceGuard verify provenance?

ProvenanceGuard routes each atomic claim to source-specific evidence and checks support with natural language inference (NLI) and a token-alignment proxy. It then compares the routed source with the attribution stated in the answer and emits a per-claim allow or block decision. Blocked answers can be repaired through retrieval-augmented answer revision and then re-verified. The pipeline requires captured MCP traces containing stable tool IDs, source IDs, and raw outputs. It explicitly targets cross-source conflation, where a claim is supported somewhere in the evidence but attributed to the wrong source.

In order, ProvenanceGuard's steps are:

  • decompose the answer into atomic claims;
  • route each claim to candidate sources;
  • verify claim support with NLI plus a token-alignment proxy;
  • compare the verifier's routed source against the answer's stated attribution;
  • produce per-claim verdicts and an overall allow or block decision.

When an answer is blocked, the system can apply a retrieval-augmented revision and run verification again.
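The steps above can be sketched as a toy pipeline. This is not the authors' implementation: a simple token-overlap score stands in for both the NLI model and the token-alignment proxy, sentence splitting stands in for atomic claim decomposition, and the bracketed `[source_id]` citation format, the `0.8` threshold, and all function names are assumptions. The repair step is omitted.

```python
import re
from typing import Dict, List, Optional, Tuple


def tokens(text: str) -> set:
    """Lowercased alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def decompose(answer: str) -> List[Tuple[str, Optional[str]]]:
    """Split the answer into (claim, stated_source) pairs.

    Assumes each sentence is one claim and cites its source as [source_id].
    """
    pairs = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        if not sentence:
            continue
        match = re.search(r"\[([^\]]+)\]", sentence)
        stated = match.group(1) if match else None
        claim = re.sub(r"\s*\[[^\]]+\]", "", sentence).strip()
        pairs.append((claim, stated))
    return pairs


def support_score(claim: str, evidence: str) -> float:
    """Fraction of claim tokens found in the evidence (stand-in for NLI)."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(evidence)) / len(claim_tokens)


def verify(answer: str, sources: Dict[str, str], threshold: float = 0.8):
    """Return (answer-level decision, per-claim verdicts).

    `sources` maps source IDs to raw tool output and must be non-empty.
    """
    verdicts = []
    for claim, stated in decompose(answer):
        # Route: pick the source whose text best supports the claim.
        routed, score = max(
            ((sid, support_score(claim, text)) for sid, text in sources.items()),
            key=lambda pair: pair[1],
        )
        supported = score >= threshold
        attribution_ok = stated is None or stated == routed
        verdicts.append({
            "claim": claim,
            "stated_source": stated,
            "routed_source": routed,
            "verdict": "allow" if supported and attribution_ok else "block",
        })
    blocked = any(v["verdict"] == "block" for v in verdicts)
    return ("block" if blocked else "allow"), verdicts


sources = {
    "lab": "Serum potassium was 5.9 mmol/L on admission.",
    "notes": "Patient reports fatigue and muscle weakness.",
}
good = "Potassium was 5.9 mmol/L [lab]. Patient reports fatigue [notes]."
swapped = "Potassium was 5.9 mmol/L [notes]."
print(verify(good, sources)[0])     # allow
print(verify(swapped, sources)[0])  # block
```

The second answer is blocked even though the claim is fully supported by the `lab` source, because the stated attribution (`notes`) does not match the routed source.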

How did it perform on medical MCP traces?

On a 40-trace held-out split, ProvenanceGuard achieved block F1 0.802 and source accuracy 0.858 over 260 source-eligible claims, and on a harder multi-source benchmark it reached block F1 0.846 while source-plus-relation accuracy dropped to 0.229. The authors evaluated the system on 281 medical-domain MCP-agent traces; a 266-trace adjudicated subset produced 2,325 LLM-assisted claim labels split by trace, and 361 held-out labels were human-verified.
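For readers unfamiliar with the two headline metrics, the sketch below shows one common way to compute them. It assumes that "block" is the positive class for block F1 and that a claim is source-eligible when it has a gold source ID; the paper may define either differently. The labels at the bottom are made-up toy data, not the paper's results.

```python
from typing import Optional, Sequence


def block_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """F1 with "block" as the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(gold, pred))
    tp = sum(g == "block" and p == "block" for g, p in pairs)
    fp = sum(g == "allow" and p == "block" for g, p in pairs)
    fn = sum(g == "block" and p == "allow" for g, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def source_accuracy(
    gold_sources: Sequence[Optional[str]],
    pred_sources: Sequence[Optional[str]],
) -> float:
    """Share of source-eligible claims whose predicted source ID is correct."""
    eligible = [(g, p) for g, p in zip(gold_sources, pred_sources) if g is not None]
    if not eligible:
        return 0.0
    return sum(g == p for g, p in eligible) / len(eligible)


# Toy labels for illustration only.
gold = ["block", "block", "allow", "allow", "block"]
pred = ["block", "allow", "allow", "block", "block"]
print(round(block_f1(gold, pred), 3))  # 0.667

gold_src = ["lab", "notes", None, "lab"]
pred_src = ["lab", "lab", "notes", "lab"]
print(round(source_accuracy(gold_src, pred_src), 3))  # 0.667
```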

The paper highlights two performance patterns. First, on the held-out split, ProvenanceGuard outperforms source-blind baselines, which do not emit claim-to-source IDs, and achieves high block detection and source accuracy. Second, exact source ownership remains difficult when sources are semantically close: the multi-source benchmark shows strong block detection (F1 0.846) but weak source-plus-relation accuracy (0.229). Repair-and-reverify resolved all blocked answers in the full trace set, commonly by falling back to conservative revisions. In 50 controlled clinical conflation probes, ProvenanceGuard detected every injected attribution swap and retained no wrong attribution.

Why it matters

Cross-source conflation is a distinct factuality failure mode for tool-using LLM agents: a claim can be supported somewhere in the evidence pool while being attributed to the wrong tool or record. ProvenanceGuard demonstrates that verifying source attribution is an independent axis of factuality verification, not covered by pooled-evidence checks alone. In medical settings the difference matters because claims linked to the wrong source could mislead downstream decisions; the paper's medical-domain evaluation and the 50 controlled clinical probes show that provenance checks can catch attribution swaps that pooled verification would miss.
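The gap between pooled-evidence checking and source-aware checking can be shown with a small contrast. This is an illustrative sketch only: token containment stands in for a real entailment check, and the source IDs, texts, and function names are invented for the example.

```python
import re
from typing import Dict


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def covered(claim: str, evidence: str) -> bool:
    """True when every claim token appears in the evidence (toy support test)."""
    return _tokens(claim) <= _tokens(evidence)


def pooled_check(claim: str, sources: Dict[str, str]) -> bool:
    """Source-blind: is the claim supported anywhere in the evidence pool?"""
    return covered(claim, " ".join(sources.values()))


def source_aware_check(claim: str, stated_source: str, sources: Dict[str, str]) -> bool:
    """Source-aware: is the claim supported by the source the answer cites?"""
    return covered(claim, sources.get(stated_source, ""))


sources = {
    "ehr_lookup": "Metformin 500 mg twice daily",
    "drug_db": "Metformin is contraindicated in severe renal impairment",
}
claim = "Metformin is contraindicated in severe renal impairment"

print(pooled_check(claim, sources))                      # True
print(source_aware_check(claim, "ehr_lookup", sources))  # False
```

The pooled check passes because the claim is supported by `drug_db`, while the source-aware check fails because the answer attributed it to `ehr_lookup`.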

What to watch

Watch whether future work narrows the gap on source-plus-relation accuracy, which fell to 0.229 on the multi-source benchmark, and whether repair-and-reverify strategies generalize beyond the evaluated medical traces. Also track extensions that test ProvenanceGuard-style verification on other MCP tool mixes and on larger, more diverse trace collections.

Authors and submission details: the paper "ProvenanceGuard: Source-Aware Factuality Verification for MCP-Based LLM Agents" was submitted to arXiv on 16 Jun 2026 by Ander Alvarez, Santhiya Rajan, Samuel Mugel and Román Orús. The evaluation numbers cited above come from the paper's reported experiments on 281 traces and the specified held-out splits and benchmarks.

Key evaluation metrics and dataset counts from ProvenanceGuard

| Item | Scope | Value |
| --- | --- | --- |
| Block F1 | 40-trace held-out split | 0.802 |
| Source accuracy | 40-trace held-out split (260 source-eligible claims) | 0.858 |
| Block F1 | Multi-source benchmark | 0.846 |
| Source-plus-relation accuracy | Multi-source benchmark | 0.229 |
| Evaluated MCP traces | Full evaluation set | 281 |
| LLM-assisted claim labels | 266-trace adjudicated subset | 2,325 |
| Human-verified labels | Held-out labels | 361 |
| Clinical conflation probes detected | Controlled probes | 50 of 50 |

Written by The Brieftide · Source: arXiv
