Open Source AI · 4 min read

DRL-CLBA: Clean-Label Backdoor Attack for Speech via DDPG

A DDPG-based clean-label backdoor that embeds sample-specific triggers via audio steganography.

The Brieftide

TL;DR

  • 01 A DDPG-based clean-label backdoor that embeds sample-specific triggers via audio steganography.
  • 02 The paper, by Yueming Huang, Wenhan Yao, Fen Xiao, Xiarun Chen and Weiping Wen, was submitted to arXiv on 2 Jul 2026 as arXiv:2607.01729.
  • 03 The steganography creates feature-space anchors; the DDPG agent optimizes target samples so they can be poisoned without changing labels (label-migration-free poisoning).

DRL-CLBA, a clean-label backdoor attack for speech classification, uses Deep Deterministic Policy Gradient (DDPG) reinforcement learning and deep audio steganography to embed sample-specific triggers and create feature-space anchors. The paper, by Yueming Huang, Wenhan Yao, Fen Xiao, Xiarun Chen and Weiping Wen, was submitted to arXiv on 2 Jul 2026 as arXiv:2607.01729.

What is DRL-CLBA and how does it work?

DRL-CLBA is a clean-label backdoor attack that embeds sample-specific triggers into source audio via deep audio steganography, then uses a DDPG reinforcement learning framework to push target samples toward those trigger-bearing anchor points in a model's deep latent space. The steganography creates the feature-space anchors; the DDPG agent optimizes target samples so they can be poisoned without any label change (label-migration-free poisoning).

The approach departs from many existing sample-specific attacks, which rely on poisoned labels and are therefore easier to catch through manual inspection of the data. DRL-CLBA instead hides triggers inside audio examples and steers training-time poisoning through latent-space optimization. The paper describes the end-to-end pipeline in three stages: trigger embedding via audio steganography, anchor creation in feature space, and DDPG-driven optimization of target samples toward those anchors, so the classifier learns the backdoor without explicit label changes.
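The anchor-seeking step can be pictured with a toy environment. This is a minimal sketch under loud assumptions, not the authors' method: the brief does not give the paper's state, action or reward definitions, so the linear feature extractor, the distance-based reward, the per-step bound `eps` and every function name below are hypothetical. A simple random-search loop stands in for the DDPG actor, which would instead learn to propose the perturbations.

```python
import random

def features(x, weights):
    """Toy linear feature extractor (assumption): one dot product per feature."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def step(x, action, anchor, weights, eps=0.05):
    """Apply a per-step bounded perturbation to x.

    The reward (an assumption) is the drop in feature-space distance
    between the perturbed sample and the anchor.
    """
    clipped = [max(-eps, min(eps, a)) for a in action]
    new_x = [xi + ai for xi, ai in zip(x, clipped)]
    before = sq_dist(features(x, weights), anchor)
    after = sq_dist(features(new_x, weights), anchor)
    return new_x, before - after

def random_search_policy(x, anchor, weights, rng, steps=200, eps=0.05):
    """Stand-in for a learned actor (NOT DDPG): keep only actions with positive reward."""
    for _ in range(steps):
        action = [rng.uniform(-eps, eps) for _ in x]
        candidate, reward = step(x, action, anchor, weights, eps)
        if reward > 0:
            x = candidate
    return x

rng = random.Random(0)
weights = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(4)]
# Hypothetical stand-in for a source clip after trigger embedding.
source_with_trigger = [rng.gauss(0, 1) for _ in range(8)]
anchor = features(source_with_trigger, weights)
target = [rng.gauss(0, 1) for _ in range(8)]  # target sample; its label is never touched

start = sq_dist(features(target, weights), anchor)
poisoned = random_search_policy(target, anchor, weights, rng)
end = sq_dist(features(poisoned, weights), anchor)
print(f"distance to anchor: {start:.3f} -> {end:.3f}")
```

The sketch only bounds each individual step, not the total change to the sample; a real attack would also need an imperceptibility constraint on the audio, which the brief does not detail.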

How effective is DRL-CLBA and which defenses does it bypass?

Experiments reported in the paper show that DRL-CLBA achieves a high attack success rate across three datasets and four DNN architectures. The authors also report that the method bypasses fine-tuning, pruning, and spectral signature defenses.

The paper highlights that the reinforcement learning framework successfully optimizes target samples toward trigger-bearing anchor points in the model's deep latent space, producing poisoned samples that keep their original labels. That property is central to the attack's ability to evade defenses that rely on label inconsistencies or easily detectable perturbations. The authors contrast this with prior sample-specific attacks, which commonly depend on poisoned labels and can therefore be detected by manually inspecting labels.
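For readers unfamiliar with the metric, attack success rate is conventionally the share of triggered inputs from non-target classes that the backdoored model assigns to the attacker's target label. The brief does not state the paper's exact definition, so the sketch below follows that common convention; the function name and the example labels are illustrative assumptions.

```python
def attack_success_rate(predictions, true_labels, target_label):
    """Fraction of triggered inputs from non-target classes predicted as target_label."""
    eligible = [p for p, y in zip(predictions, true_labels) if y != target_label]
    if not eligible:
        raise ValueError("no triggered samples from non-target classes")
    return sum(p == target_label for p in eligible) / len(eligible)

# Hypothetical model outputs on five triggered clips; the target label is "yes".
preds_on_triggered = ["yes", "yes", "no", "yes", "yes"]
true_labels = ["no", "up", "up", "yes", "down"]

asr = attack_success_rate(preds_on_triggered, true_labels, "yes")
print(asr)  # 0.75: three of the four non-"yes" clips are pushed to "yes"
```

Clips whose true label already equals the target are excluded, since predicting the target for them says nothing about the backdoor.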

Why it matters

DRL-CLBA exposes a concrete vulnerability in speech-controlled systems by combining two hard-to-detect techniques: audio steganography for imperceptible, sample-specific triggers and reinforcement learning to place poisoned samples in feature space without label changes. Systems that accept large-scale or crowd-sourced audio data for training are particularly exposed because clean labels remain intact while poisoned features are learned by the model.

The reported resistance to fine-tuning, pruning, and spectral signature defenses suggests that common hardening steps may not be sufficient when adversaries target a model's latent-space geometry rather than surface-level artifacts. The combination of DDPG optimization and feature-space anchors creates a stealthy attack vector that defenders must explicitly consider.
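To make the spectral signature defense concrete: it scores each training sample by the squared projection of its centered latent representation onto the top principal direction of the set, then treats the highest-scoring samples as suspects. The sketch below is a stdlib-only illustration with synthetic data, where power iteration is used to find that direction; the representation size, the crude shifted cluster and all names are assumptions. It shows the kind of outlier the defense is built to catch, which is the signal the paper reports DRL-CLBA avoids leaving.

```python
import random

def spectral_scores(reps, iters=100):
    """Squared projection of each centered representation onto the
    top principal direction, found here by power iteration."""
    n, d = len(reps), len(reps[0])
    mean = [sum(r[j] for r in reps) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in reps]
    v = [1.0 / d ** 0.5] * d  # unit-norm starting direction
    for _ in range(iters):
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(p * c[j] for p, c in zip(proj, centered)) for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) ** 2 for c in centered]

rng = random.Random(1)
# 40 synthetic "clean" representations and 4 crudely poisoned ones that
# share a large shift along one latent dimension (indices 40..43).
clean = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(40)]
crude = [[rng.gauss(0, 1) + (10.0 if j == 0 else 0.0) for j in range(6)]
         for _ in range(4)]

scores = spectral_scores(clean + crude)
flagged = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:4]
print(sorted(flagged))
```

A poisoning method whose samples do not form a separable cluster along any dominant latent direction would receive unremarkable scores here, which is one plausible reading of why the defense is reported to fail against this attack.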

What to watch

Watch the paper's arXiv page (arXiv:2607.01729) for accompanying code, data and demos linked from the submission, and for follow-up work evaluating defenses specifically tailored to latent-space, clean-label poisoning.

Figure: DRL-CLBA system components and data flow. Nodes in the order shown: source audio; deep audio steganography (embeds sample-specific triggers); trigger-bearing anchor points (feature-space anchors); DDPG reinforcement learning agent; target samples (optimized toward anchors); training set with poisoned samples; trained speech classifier (at inference: backdoor misclassification).

Written by The Brieftide · Source: arXiv
