Multimodal AI · June 17, 2026 · 4 min read

ProvenanceGuard paper: source-aware verifier, block F1 0.802

ProvenanceGuard verifies claim-to-source attribution for MCP-based LLM agents.

The Brieftide · June 17, 2026

TL;DR

01 ProvenanceGuard verifies claim-to-source attribution for MCP-based LLM agents.
02 ProvenanceGuard, introduced on arXiv on 16 Jun 2026 by Ander Alvarez, Santhiya Rajan, Samuel Mugel and Román Orús, is a source-aware factuality verifier for agents that use the Model Context Protocol.
03 The system consumes MCP traces with stable tool IDs and source IDs, decomposes answers into atomic claims, and returns per-claim verdicts plus an answer-level allow or block decision.

How does ProvenanceGuard verify provenance?

ProvenanceGuard routes each atomic claim to source-specific evidence, checks support with natural language inference (NLI) and a token-alignment proxy, compares the routed source with the answer's stated attribution, and emits per-claim allow/block decisions. Blocked answers can be repaired with retrieval-augmented answer revision and then re-verified. The pipeline requires captured MCP traces containing stable tool IDs, source IDs, and raw outputs, and it explicitly targets cross-source conflation, where a claim is supported somewhere in the evidence but attributed to the wrong source.

In order, the pipeline's steps are: decompose the answer into atomic claims, route each claim to candidate sources, verify claim support with NLI plus a token-alignment proxy, compare the verifier's routed source against the answer's stated attribution, and produce per-claim verdicts and an overall allow or block decision. When an answer is blocked, the system can apply a retrieval-augmented revision and run verification again.
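The steps above can be sketched in Python. This is a minimal illustration, not the paper's implementation: the names `SourceRecord`, `Claim`, `support_score`, `verify_claim` and `verify_answer`, the 0.8 threshold, and the sample data are all hypothetical, and a simple token-overlap score stands in for the NLI model and token-alignment proxy. Claims are assumed to be already decomposed.

```python
import re
from dataclasses import dataclass


@dataclass
class SourceRecord:
    tool_id: str     # stable tool ID captured in the MCP trace
    source_id: str   # stable source ID captured in the MCP trace
    raw_output: str  # raw tool output captured in the MCP trace


@dataclass
class Claim:
    text: str              # one atomic claim from the answer
    stated_source_id: str  # source the answer attributes the claim to


def tokens(text):
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(claim_text, evidence):
    """Fraction of claim tokens found in the evidence.

    Assumption: a crude stand-in for NLI plus a token-alignment proxy.
    """
    claim_tokens = tokens(claim_text)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(evidence)) / len(claim_tokens)


def verify_claim(claim, sources, threshold=0.8):
    """Route the claim to its best-supporting source, then check attribution."""
    best = max(sources, key=lambda s: support_score(claim.text, s.raw_output))
    if support_score(claim.text, best.raw_output) < threshold:
        return ("block", "unsupported", None)
    if best.source_id != claim.stated_source_id:
        # Supported somewhere, but attributed to the wrong source.
        return ("block", "misattributed", best.source_id)
    return ("allow", "supported", best.source_id)


def verify_answer(claims, sources):
    """Per-claim verdicts plus an answer-level allow or block decision."""
    verdicts = [verify_claim(c, sources) for c in claims]
    decision = "block" if any(v[0] == "block" for v in verdicts) else "allow"
    return decision, verdicts


# Hypothetical trace with two sources and an answer with two claims.
sources = [
    SourceRecord("labs_tool", "src-labs", "Potassium 5.9 mmol/L measured on admission"),
    SourceRecord("meds_tool", "src-meds", "Patient takes lisinopril 10 mg daily"),
]
claims = [
    Claim("Potassium 5.9 mmol/L on admission", "src-labs"),
    Claim("Patient takes lisinopril 10 mg daily", "src-labs"),  # wrong attribution
]

decision, verdicts = verify_answer(claims, sources)
print(decision)
for claim, verdict in zip(claims, verdicts):
    print(claim.text, "->", verdict)
```

In this toy run the second claim is fully supported by the medication record but attributed to the lab record, so it is blocked as misattributed and the answer as a whole is blocked. The repair step (retrieval-augmented revision followed by re-verification) is not modeled here.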

How did it perform on medical MCP traces?

On a 40-trace held-out split, ProvenanceGuard achieved block F1 0.802 and source accuracy 0.858 over 260 source-eligible claims. On a harder multi-source benchmark it reached block F1 0.846, while source-plus-relation accuracy dropped to 0.229. The authors evaluated the system on 281 medical-domain MCP-agent traces; a 266-trace adjudicated subset produced 2,325 LLM-assisted claim labels, split by trace, and 361 held-out labels were human-verified.

The paper highlights two performance patterns. First, on the held-out split ProvenanceGuard outperforms source-blind baselines, which do not emit claim-to-source IDs, with high block detection and source accuracy. Second, exact source ownership remains difficult when sources are semantically close: the multi-source benchmark shows strong block detection (F1 0.846) but weak source-plus-relation accuracy (0.229). Repair-and-reverify resolved all blocked answers in the full trace set, commonly by falling back to conservative revisions. In 50 controlled clinical conflation probes, ProvenanceGuard detected all injected attribution swaps with no retained wrong attribution.
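For readers unfamiliar with the metrics, the sketch below shows one plausible way to compute them. It assumes block F1 is the F1 score with "block" as the positive class, and that source accuracy is the share of source-eligible claims whose predicted source ID matches the labeled one. The function names and toy labels are illustrative and are not taken from the paper.

```python
def block_f1(gold, predicted):
    """F1 with "block" as the positive class (assumed definition)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == "block" and p == "block")
    fp = sum(1 for g, p in pairs if g == "allow" and p == "block")
    fn = sum(1 for g, p in pairs if g == "block" and p == "allow")
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def source_accuracy(gold_ids, predicted_ids):
    """Share of source-eligible claims routed to the labeled source ID."""
    if not gold_ids:
        return 0.0
    correct = sum(1 for g, p in zip(gold_ids, predicted_ids) if g == p)
    return correct / len(gold_ids)


# Toy labels for illustration only; these are not the paper's data.
gold_decisions = ["block", "block", "block", "allow", "allow"]
pred_decisions = ["block", "block", "allow", "block", "allow"]
gold_sources = ["src-labs", "src-meds", "src-labs", "src-notes"]
pred_sources = ["src-labs", "src-meds", "src-meds", "src-notes"]

f1 = block_f1(gold_decisions, pred_decisions)
acc = source_accuracy(gold_sources, pred_sources)
print(round(f1, 3), round(acc, 3))
```

Keeping the two metrics separate mirrors the pattern reported above: a verifier can be good at deciding when to block while still struggling to name the exact source.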

Why it matters

Cross-source conflation is a distinct factuality failure mode for tool-using LLM agents: a claim can be supported somewhere in the evidence pool while being attributed to the wrong tool or record. ProvenanceGuard demonstrates that verifying source attribution is an independent axis of factuality verification, not covered by pooled-evidence checks alone. In medical settings the difference matters because claims linked to the wrong source could mislead downstream decisions; the paper's medical-domain evaluation and the 50 controlled clinical probes show that provenance checks can catch attribution swaps that pooled verification would miss.

What to watch

Watch whether future work narrows the gap on source-plus-relation accuracy, which fell to 0.229 on the multi-source benchmark, and whether repair-and-reverify strategies generalize beyond the evaluated medical traces. Also track extensions that test ProvenanceGuard-style verification on other MCP tool mixes and on larger, more diverse trace collections.

Authors and submission details: the paper "ProvenanceGuard: Source-Aware Factuality Verification for MCP-Based LLM Agents" was submitted to arXiv on 16 Jun 2026 by Ander Alvarez, Santhiya Rajan, Samuel Mugel and Román Orús. The evaluation numbers cited above come from the paper's reported experiments on 281 traces and the specified held-out splits and benchmarks.

Key evaluation metrics and dataset counts from ProvenanceGuard

Item	Scope	Value
Block F1	40-trace held-out split	0.802
Source accuracy	40-trace held-out split (260 source-eligible claims)	0.858
Block F1	multi-source benchmark	0.846
Source-plus-relation accuracy	multi-source benchmark	0.229
Evaluated MCP traces	full evaluation set	281
LLM-assisted claim labels	266-trace adjudicated subset	2,325
Human-verified labels	held-out labels	361
Clinical conflation probes detected	controlled probes	50/50

Written by The Brieftide · Source: arXiv
