GLARE: Natural-Language Queries for Global Explanations
An LLM translates natural-language questions into SQL over local explanation data, returning statistics-augmented answers.
TL;DR
- An LLM translates natural-language questions into SQL over local explanation data, returning statistics-augmented answers.
- GLARE is an LLM-based interactive interface that gives natural-language access to global explanations for black-box image classifiers. The authors submitted it to arXiv on 18 Jun 2026.
- The paper, arXiv:2606.19735, by Bhavan Vasu and Rajesh Mangannavar, is 16 pages long and includes 2 figures.
GLARE is an LLM-based interactive interface that gives natural-language access to global explanations for black-box image classifiers. The authors submitted it to arXiv on 18 Jun 2026. The paper, arXiv:2606.19735, by Bhavan Vasu and Rajesh Mangannavar, is 16 pages long and includes 2 figures.
What is GLARE?
GLARE is a natural-language interface that lets users pose targeted questions about global explanations and receive aggregated, explanation-aware answers. Rather than handing over static explanation artifacts, the system assumes users want specific, intent-driven information, and it mediates their questions through a core large language model.
The paper frames the problem around the complexity and monolithic nature of global explanations, arguing that users typically seek targeted answers. GLARE targets global explanations for black-box image classifiers and structures interaction so that natural-language queries become structured, queryable objects.
How does GLARE work?
GLARE's core LLM translates natural language questions into structured SQL queries over local explanation data, enabling flexible aggregation without exposing low-level representations. The paper describes the LLM as a mediator that maps user intent to query logic, runs those queries over stored local explanations, and returns results augmented with statistics.
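The translation step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `local_explanations` table, its columns, the sample rows, and the `translate_to_sql` function are all assumptions made for the example, and the LLM call is replaced by a fixed query so the snippet runs on its own.

```python
import sqlite3

# Hypothetical schema (not from the paper): one row per image/concept attribution
# produced by some local explanation method.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE local_explanations ("
    "image_id TEXT, predicted_class TEXT, concept TEXT, attribution REAL)"
)
conn.executemany(
    "INSERT INTO local_explanations VALUES (?, ?, ?, ?)",
    [
        ("img_001", "zebra", "stripes", 0.75),
        ("img_001", "zebra", "grass", 0.25),
        ("img_002", "zebra", "stripes", 0.5),
        ("img_002", "zebra", "grass", 0.5),
        ("img_003", "horse", "grass", 0.75),
    ],
)


def translate_to_sql(question):
    """Stand-in for the LLM step (assumption): returns a fixed parameterized query.

    A real system would prompt a model with the schema and the user's question.
    """
    sql = (
        "SELECT concept, AVG(attribution) AS mean_attr, COUNT(*) AS n "
        "FROM local_explanations WHERE predicted_class = ? "
        "GROUP BY concept ORDER BY mean_attr DESC"
    )
    return sql, ("zebra",)


question = "Which concepts matter most when the model predicts zebra?"
sql, params = translate_to_sql(question)

# Aggregate across local explanation records; the user never sees raw rows.
result = conn.execute(sql, params).fetchall()
for concept, mean_attr, n in result:
    print(f"{concept}: mean attribution {mean_attr:.3f} over {n} images")
```

Passing the class name as a bound parameter rather than splicing it into the SQL string is a design choice for this sketch; the paper does not say how GLARE handles query construction or validation.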
For each query the interface produces statistics-augmented natural language responses, and it supports linking back to local explanations as well as producing intent-aligned visualizations. The workflow the authors describe starts from a natural language question, proceeds through intent interpretation and SQL mapping, executes aggregation across local explanation records, and then emits combined outputs: textual summaries, references to local explanations, and visualizations aligned to the user's intent.
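As a rough illustration of what a statistics-augmented response with links back to local explanations might look like, the sketch below turns a set of per-image attributions into a short sentence plus a list of supporting image IDs. The `summarize` function, the record layout, and the choice of mean and standard deviation are assumptions for the example; the paper does not specify which statistics GLARE reports, and the visualization step is omitted here.

```python
from statistics import mean, pstdev


def summarize(concept, records, top_k=2):
    """Build a text summary and links for one concept.

    records: list of (image_id, attribution) pairs (hypothetical layout).
    """
    values = [attribution for _, attribution in records]
    # Keep the strongest examples so the answer can point back to local explanations.
    top = sorted(records, key=lambda r: r[1], reverse=True)[:top_k]
    text = (
        f"'{concept}' has mean attribution {mean(values):.2f} "
        f"(std {pstdev(values):.2f}, n={len(values)})."
    )
    return {"text": text, "linked_images": [image_id for image_id, _ in top]}


records = [("img_001", 0.75), ("img_002", 0.5), ("img_004", 0.25)]
answer = summarize("stripes", records)
print(answer["text"])
print("See local explanations for:", ", ".join(answer["linked_images"]))
```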
The authors evaluate the system along several axes: intent interpretation, query mapping accuracy, generalization to novel queries and datasets, and robustness to linguistic errors. The paper reports that the evaluation shows improved accessibility and usability; in the authors' words, "LLM-mediated querying substantially improves the accessibility and usability of global explanations for human-centered XAI."
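The brief does not say how query mapping accuracy is measured. One common way to score text-to-SQL output, shown here purely as an assumption, is execution match: a generated query counts as correct if it returns the same rows as a reference query. The table, the sample rows, and the query pairs below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE local_explanations ("
    "image_id TEXT, predicted_class TEXT, concept TEXT, attribution REAL)"
)
conn.executemany(
    "INSERT INTO local_explanations VALUES (?, ?, ?, ?)",
    [
        ("img_001", "zebra", "stripes", 0.75),
        ("img_001", "zebra", "grass", 0.25),
        ("img_002", "zebra", "stripes", 0.5),
        ("img_002", "zebra", "grass", 0.5),
        ("img_003", "horse", "grass", 0.75),
    ],
)


def execution_match(conn, predicted_sql, reference_sql):
    """True if both queries return the same rows, ignoring row order."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # malformed SQL counts as a mapping failure
    reference = conn.execute(reference_sql).fetchall()
    return sorted(predicted, key=repr) == sorted(reference, key=repr)


# (predicted, reference) pairs; "predicted" stands in for LLM output.
pairs = [
    ("SELECT COUNT(*) FROM local_explanations WHERE predicted_class = 'zebra'",
     "SELECT COUNT(image_id) FROM local_explanations WHERE predicted_class = 'zebra'"),
    ("SELECT concept, AVG(attribution) FROM local_explanations GROUP BY concept",
     "SELECT concept, AVG(attribution) FROM local_explanations "
     "GROUP BY concept ORDER BY concept"),
    # Wrong class in the filter: runs fine but returns different rows.
    ("SELECT DISTINCT concept FROM local_explanations WHERE predicted_class = 'horse'",
     "SELECT DISTINCT concept FROM local_explanations WHERE predicted_class = 'zebra'"),
    # Typo in the keyword: fails to parse.
    ("SELEC concept FROM local_explanations",
     "SELECT concept FROM local_explanations"),
]

accuracy = sum(execution_match(conn, p, r) for p, r in pairs) / len(pairs)
print(f"execution-match accuracy: {accuracy:.2f}")
```

Execution match tolerates syntactically different but equivalent queries, which is why it is used here instead of string comparison; it can still overcount when two different queries happen to agree on a small dataset.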
Why it matters
Global explanations are often large and hard to explore. GLARE changes the interaction model from browsing static artifacts to asking focused questions in natural language, which can lower the barrier for practitioners and non-experts to gain actionable insights about model behaviour. The approach isolates two friction points: translating human intent into formal queries, and aggregating local explanations into concise summaries. By placing an LLM between user and explanation store, GLARE addresses both.
Practically, that matters to teams auditing models, analysts investigating class- or dataset-level failure modes, and designers who need targeted explanation slices rather than full monolithic visualizations. The paper positions GLARE as a human-centered tool for XAI that emphasizes targeted answers and intent-aligned visuals over one-size-fits-all explanation artifacts.
What to watch
Look for linked code, data, and demos in the "Code, Data and Media Associated with this Article" section of the paper's arXiv page, and for follow-up work that publishes quantitative metrics for query mapping accuracy and generalization across more datasets. The authors evaluated robustness to linguistic errors and generalization to novel queries and datasets; publishing the underlying datasets and code would let others reproduce and stress-test those claims.
Paper details: arXiv:2606.19735, submitted 18 Jun 2026, authors Bhavan Vasu and Rajesh Mangannavar, 16 pages, 2 figures.
Written by The Brieftide · Source: arXiv