
Topic hub
Model Compression
Covers methods to shrink and speed up AI models, including pruning, quantization, distillation, MoE compression, and alignment between training and inference.
77 briefs
Model Compression · Page 3
- Serverless GPUs add Opus 4.7 Fast and Qwen Image 2.0 support · The Brieftide

- OpenAI Codex with GPT-5.5 used to ship production systems · The Brieftide

- Parameter Golf challenge draws 1,000+ participants, 2,000+ · The Brieftide

- MIT WRING: debiasing vision models without bias amplification · The Brieftide

- Privacy-preserving on-device AI: MIT method enables training · The Brieftide

- EnergAIzer by MIT: fast AI power estimates in seconds · The Brieftide

- Transformers.js for Chrome extensions: how to run models · The Brieftide

- Hugging Face launches transformers-to-mlx converter for models · The Brieftide

- MIT pruning method trims models during training, reduces compute · The Brieftide

- mRNA language models: OpenMed trains 25-species models for $165 · The Brieftide

- OpenClaw: Hugging Face releases open-source LLM fork v1.0 · The Brieftide

- Hugging Face and NVIDIA: build domain embeddings in a day · The Brieftide

- Ulysses Sequence Parallelism: Million-Token Training · The Brieftide

- DeepMind Nano Banana 2 release: image model at Flash speed · The Brieftide

- Mixture of Experts (MoE) in Transformers: Hugging Face guide · The Brieftide

- Information-driven imaging: Berkeley releases noisy-data estimator · The Brieftide

- AlphaFold DeepMind: engineering heat-tolerant Rubisco for crops · The Brieftide

- DeepSeek V3 to V3.2: architecture, sparse attention, RL updates · The Brieftide

- Nano Banana Pro: DeepMind's Gemini 3 Pro Image model launch · The Brieftide

- DeepSeek V3: 671B model, MLA and MoE architectural choices, 2025 · The Brieftide

- PEVA whole-body egocentric video prediction with 16s rollouts · The Brieftide

- SecAlign and StruQ: Berkeley AI defenses cut prompt-injection · The Brieftide
