
Ghost Attractor Networks: 2.3M decoder matches 1.07B Diffusion Transformer

Ghost Attractor Networks, a 2.3-million-parameter dynamical decoder, matches the offline accuracy of a 1.07-billion-parameter Diffusion Transformer with 462× fewer parameters.

The Brieftide

TL;DR

  • Ghost Attractor Networks, a 2.3-million-parameter dynamical decoder, matches the offline accuracy of a 1.07-billion-parameter Diffusion Transformer with 462× fewer parameters.
  • Ghost Attractor Networks, a dynamical decoder by Tianyu Wang, Ying Wang, Zhihao Liu, Xi Vincent Wang and Lihui Wang, was submitted to arXiv on 16 Jun 2026.
  • The paper proposes a learned potential-drift latent dynamics that builds a basin-attractor geometry for closed-loop sequential generation and tests it as a robotic action decoder.

Ghost Attractor Networks, a dynamical decoder by Tianyu Wang, Ying Wang, Zhihao Liu, Xi Vincent Wang and Lihui Wang, was submitted to arXiv on 16 Jun 2026. The paper proposes a learned potential-drift latent dynamics that builds a basin-attractor geometry for closed-loop sequential generation and tests it as a robotic action decoder.

How does Ghost Attractor Networks work?

Ghost Attractor Networks is a dynamical decoder whose latent state evolves under a learned potential with drift, which yields a basin-attractor structure by construction. Mode transitions arise as saddle-node bifurcations with ghost-attractor escape, and a hierarchical phase-space decomposition separates first-order basin convergence from second-order proprioceptive refinement. The authors motivate the potential-drift form with three desiderata (multi-modality, decoder-level single-pass switching, and constant memory) and train Ghost end-to-end with a behavioral-cloning and contrastive objective.
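To make the vocabulary concrete, here is a minimal one-dimensional sketch of potential-plus-drift dynamics. It is a textbook toy, not the paper's learned model: the double-well potential, the constant drift, and every name in the code (`grad_V`, `rollout`, `U_CRIT`, `Z_GHOST`) are assumptions chosen for illustration. With no drift, each starting point settles into its own basin. When the drift slightly exceeds the saddle-node threshold, the left well disappears, yet the state still lingers near where it used to be (the "ghost") before escaping to the other basin.

```python
import numpy as np

def grad_V(z):
    # Toy double-well potential V(z) = z**4/4 - z**2/2, so V'(z) = z**3 - z.
    # This stands in for a learned potential; it is not the paper's model.
    return z**3 - z

def rollout(z0, drift, steps=2000, dt=0.01):
    # Forward-Euler integration of dz/dt = -V'(z) + drift.
    traj = np.empty(steps + 1)
    traj[0] = z = z0
    for t in range(steps):
        z = z + dt * (-grad_V(z) + drift)
        traj[t + 1] = z
    return traj

# For this potential the left well and the saddle merge (saddle-node
# bifurcation) when the drift reaches 2 / (3 * sqrt(3)), at z = -1/sqrt(3).
U_CRIT = 2.0 / (3.0 * np.sqrt(3.0))
Z_GHOST = -1.0 / np.sqrt(3.0)

# No drift: starts on either side of the saddle at z = 0 fall into
# different basins (z = -1 and z = +1).
left = rollout(-0.3, 0.0)
right = rollout(0.3, 0.0)

# Drift just above vs. well above the threshold, both starting in the old left well.
near = rollout(-1.0, 1.05 * U_CRIT, steps=20000)
far = rollout(-1.0, 2.00 * U_CRIT, steps=20000)

def dwell(traj, width=0.2):
    # Number of integration steps spent within `width` of the ghost location.
    return int(np.sum(np.abs(traj - Z_GHOST) < width))

dwell_near, dwell_far = dwell(near), dwell(far)
print("final states without drift:", left[-1], right[-1])
print("final state just past threshold:", near[-1])
print("steps spent near the ghost:", dwell_near, "vs", dwell_far)
```

In this toy the dwell time near the ghost grows as the drift approaches the threshold from above, which is the slow-passage behaviour usually meant by a ghost attractor. How the paper parameterises its potential and drift is not specified here.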

How does it perform versus alternatives?

A 2.3-million-parameter Ghost matches the offline accuracy of a 1.07-billion-parameter Diffusion Transformer while using 462 times fewer parameters and running at 32 times lower latency. On held-out latent dynamics, its gradient norm decayed by 67 percent across five integration steps on 1,430 samples. The paper also reports that the 2.3M Ghost beats five alternative 2M-parameter decoders (MLP, Neural ODE, CVAE, Transformer, 1-step Diffusion) on offline mean squared error by 5.9 to 29 percent.
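A quick arithmetic check on those headline figures, using only the rounded numbers quoted above. Dividing the rounded parameter counts gives a ratio near 465 rather than 462; the gap presumably comes from rounding in the quoted counts, which is an assumption, since exact counts are not given here. The per-step contraction factor further assumes the gradient norm shrinks geometrically across the five steps, which the source does not state.

```python
# Rounded figures as quoted in this brief.
ghost_params = 2.3e6      # 2.3 million
dit_params = 1.07e9       # 1.07 billion

# Parameter ratio implied by the rounded counts.
param_ratio = dit_params / ghost_params

# A 67% decay leaves 33% of the initial gradient norm after five steps.
remaining = 1.0 - 0.67
n_steps = 5

# ASSUMPTION: geometric (constant-ratio) decay per integration step.
per_step_factor = remaining ** (1.0 / n_steps)

print(f"parameter ratio from rounded counts: {param_ratio:.1f}x")
print(f"implied per-step gradient-norm factor: {per_step_factor:.3f}")
```

Under that geometric assumption, each integration step would shrink the gradient norm by roughly a fifth.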

In closed-loop evaluation on the LIBERO-10 benchmark, phase conditioning on Ghost's basin-structured latent produced a 13.5 percentage-point success-rate gain over a feed-forward MLP baseline, and persistent-latent ensembling reached a 95.7 percent final success rate.

Why does the basin-structured latent matter?

A structured latent with stable basins gives the decoder explicit geometry for phase-conditioned action generation and for carrying information across steps without storing long histories. That geometry lets Ghost implement mode switching via dynamical bifurcations rather than by iterating large decoders or expanding memory, yielding the parameter and latency reductions reported. For robotics tasks that impose closed-loop control constraints, the paper's results point to a practical route to lower-cost decoders that retain or improve task accuracy.
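The constant-memory idea can be sketched as a control loop that carries one fixed-size latent from step to step instead of a growing history. Everything below is a hypothetical toy rather than the paper's architecture: the observation-tilted double-well potential, the `tanh` readout, and the names `grad_potential`, `control_step`, and `run_episode` are all assumptions made for illustration.

```python
import numpy as np

def grad_potential(z, obs):
    # Hypothetical observation-tilted double well, applied per latent dimension:
    # V(z; obs) = z**4/4 - z**2/2 - obs*z, so dV/dz = z**3 - z - obs.
    return z**3 - z - obs

def control_step(z, obs, n_inner=5, dt=0.1):
    # A few Euler steps of gradient flow on the latent, then a readout.
    for _ in range(n_inner):
        z = z - dt * grad_potential(z, obs)
    action = np.tanh(z)  # hypothetical action readout
    return z, action

def run_episode(horizon=200, dim=4, seed=0):
    rng = np.random.default_rng(seed)
    z = np.zeros(dim)  # the only state carried between control steps
    z_mid = None
    for t in range(horizon):
        # The observation favours one mode in the first half, the other afterwards.
        target = 0.6 if t < horizon // 2 else -0.6
        obs = target + 0.05 * rng.standard_normal(dim)
        z, action = control_step(z, obs)  # `action` would be sent to the robot
        if t == horizon // 2 - 1:
            z_mid = z.copy()
    return z_mid, z

z_mid, z_end = run_episode()
print("latent before the switch:", z_mid)
print("latent after the switch: ", z_end)
```

Memory use here is independent of the horizon: only `z` persists between steps, and the mode switch happens because the tilt removes the old basin, not because any history is replayed.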

What to watch

Watch whether Ghost's basin-structured approach replicates these gains beyond LIBERO-10 and across more diverse robotics tasks and deployment settings, and whether the gradient-flow contraction measured on 1,430 held-out samples holds over longer horizons. The next confirming signals would be matched offline accuracy reported on additional benchmarks, or comparable closed-loop success rates when phase conditioning and persistent-latent ensembling are applied elsewhere.

References and provenance

The description, numbers and claims here are drawn from the arXiv submission "Ghost Attractor Networks: Basin-Structured Dynamical Decoders for Closed-Loop Sequential Generation," submitted 16 Jun 2026 by Tianyu Wang, Ying Wang, Zhihao Liu, Xi Vincent Wang and Lihui Wang (arXiv:2606.18315).

Performance and benchmark comparison

Item | Ghost | Diffusion Transformer | Five ~2M-parameter decoders
Parameters | 2.3 million | 1.07 billion | approximately 2 million
Relative parameter factor | 462× fewer (vs Diffusion Transformer) | baseline | baseline
Latency | 32× lower (vs Diffusion Transformer) | baseline | baseline
Offline accuracy / MSE | Matches Diffusion Transformer; 5.9%–29% lower MSE vs five 2M decoders | matches (baseline) | baseline (MLP, Neural ODE, CVAE, Transformer, 1-step Diffusion)
Latent dynamics metric | Gradient norm decayed by 67% across five integration steps on 1,430 held-out samples | n/a | n/a
LIBERO-10 closed-loop | Phase conditioning +13.5 percentage points vs MLP; persistent-latent ensembling 95.7% final success rate | n/a | MLP baseline (for comparison)

Written by The Brieftide · Source: arXiv
