Multimodal AI · June 25, 2026 · 5 min read

PHANTOM dataset: 47 524 multimodal adversarial attacks for VLMs

Open-source PHANTOM bundles 47 524 pre-generated attacks across 10 high-level categories and 55 subcategories to test vision-language models.

The Brieftide · June 25, 2026

PHANTOM, a new open-source dataset introduced on arXiv on 23 Jun 2026, provides 47 524 pre-generated multimodal adversarial samples aimed at vision-language models. The paper, authored by Simone Gallivanone, Hossein Khodadadi, Mauro Dore, Mauro Medda and Nicola Franco, presents the dataset as a consolidated resource for researchers and practitioners.

What is PHANTOM?

PHANTOM is a large-scale, curated collection of pre-generated adversarial attacks for vision-language models, designed to be diverse and practical for robustness and safety evaluations. The dataset consolidates and extends prior benchmarks from multiple established sources, producing 7 826 intents and introducing an additional category to broaden coverage.

The authors present PHANTOM as a complement to existing efforts, with an explicit focus on accessibility: generating large numbers of attacks is expensive and complex, so a ready-made corpus is meant to lower that barrier. The submission lists the dataset as open-source and points to the release through a URL given in the paper's comments.

How is the dataset structured and sourced?

PHANTOM covers 10 high-level categories and 55 subcategories of harmful intents, and contains 47 524 adversarial samples generated with state-of-the-art attack strategies from recent literature. According to the paper, consolidating prior benchmarks into a single collection yields 7 826 intents, and an added category increases coverage.
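To give a rough sense of scale, the figures quoted above can be turned into simple averages. The short Python sketch below does only that arithmetic; the constant and function names are illustrative, and the real distribution of samples across intents and categories is not stated here and may be far from uniform.

```python
# Headline figures as quoted in this brief.
NUM_SAMPLES = 47_524
NUM_INTENTS = 7_826
NUM_CATEGORIES = 10
NUM_SUBCATEGORIES = 55


def per_unit_averages():
    """Return plain averages derived from the headline counts.

    These are back-of-the-envelope ratios only; they assume nothing
    about how samples are actually distributed in the dataset.
    """
    return {
        "samples_per_intent": NUM_SAMPLES / NUM_INTENTS,
        "intents_per_subcategory": NUM_INTENTS / NUM_SUBCATEGORIES,
        "subcategories_per_category": NUM_SUBCATEGORIES / NUM_CATEGORIES,
    }


if __name__ == "__main__":
    for name, value in per_unit_averages().items():
        print(f"{name}: {value:.2f}")
```

On these numbers, each intent has about six adversarial variants on average, which is what makes per-intent comparisons across attack strategies plausible.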

The authors emphasize diversity and representativeness: attacks span multiple harmful intent types and subtypes so that evaluations can exercise a range of failure modes. The paper frames the dataset as practical for tasks such as evaluating model robustness, fine-tuning attack-generation models, and stress-testing defensive guardrails under varied adversarial conditions.
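As an illustration of the robustness-evaluation use case, a team might report attack success rate per category. The Python sketch below assumes each evaluated sample has already been reduced to a record with a `category` string and an `attack_succeeded` boolean; both field names are hypothetical and are not taken from PHANTOM's actual schema, and deciding whether an attack succeeded is left to whatever judge the evaluator uses.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def attack_success_rate_by_category(
    results: Iterable[Mapping[str, object]],
) -> Dict[str, float]:
    """Compute the fraction of successful attacks for each category.

    Each record is expected to carry two hypothetical fields:
    "category" (str) and "attack_succeeded" (bool).
    """
    totals: Dict[str, int] = defaultdict(int)
    successes: Dict[str, int] = defaultdict(int)
    for row in results:
        category = str(row["category"])
        totals[category] += 1
        if row["attack_succeeded"]:
            successes[category] += 1
    return {category: successes[category] / totals[category] for category in totals}


if __name__ == "__main__":
    # Toy records for illustration only; not real PHANTOM data.
    toy_results = [
        {"category": "category_a", "attack_succeeded": True},
        {"category": "category_a", "attack_succeeded": False},
        {"category": "category_b", "attack_succeeded": False},
    ]
    print(attack_success_rate_by_category(toy_results))
```

Reporting the rate per category rather than as a single aggregate keeps a weak spot in one category from being hidden by strong results elsewhere.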

Why it matters

PHANTOM makes a costly, time-consuming research task repeatable and shareable by publishing a large set of pre-generated attacks. Providing 47 524 samples and a taxonomy spanning 10 categories and 55 subcategories lets teams run comparable evaluations without redoing expensive attack generation. That reduces friction for reproducibility and lets researchers focus on defenses, alignment, and benchmarking rather than raw attack production.

Consolidating 7 826 intents and adding a new category also helps standardize evaluation scopes across studies. When multiple groups use the same adversarial corpus, comparisons of robustness and guardrail effectiveness become more meaningful. The paper positions PHANTOM as an enabling tool for systematic stress-testing of vision-language systems.

What to watch

Check the dataset release linked in the paper’s comments and its accompanying materials for attack-generation scripts, labels, and provenance details; those will determine how easily PHANTOM can slot into existing evaluation pipelines. Future work that reuses PHANTOM for benchmarking or that reports defense performance on the same 47 524 samples will be the clearest signal of adoption.

PHANTOM appears on arXiv as arXiv:2606.24388 (submitted 23 Jun 2026). The authors listed are Simone Gallivanone, Hossein Khodadadi, Mauro Dore, Mauro Medda and Nicola Franco. The paper frames the dataset as a community resource to enable more reproducible, comprehensive and comparable evaluations of vision-language model safety and robustness.

Written by The Brieftide · Source: arXiv
