Topic hub

Model Compression

Covers methods for shrinking and speeding up AI models, including pruning, quantization, distillation, MoE compression, and training-inference alignment.

77 briefs

Model Compression · Page 3

  1. Serverless GPUs add Opus 4.7 Fast and Qwen Image 2.0 support · The Brieftide
  2. OpenAI Codex with GPT-5.5 used to ship production systems · The Brieftide
  3. Parameter Golf challenge draws 1,000+ participants, 2,000+ · The Brieftide
  4. MIT WRING: debiasing vision models without bias amplification · The Brieftide
  5. Privacy-preserving on-device AI: MIT method enables training · The Brieftide
  6. EnergAIzer by MIT: fast AI power estimates in seconds · The Brieftide
  7. Transformers.js for Chrome extensions: how to run models · The Brieftide
  8. Hugging Face launches transformers-to-mlx converter for models · The Brieftide
  9. MIT pruning method trims models during training, reduces compute · The Brieftide
  10. mRNA language models: OpenMed trains 25-species models for $165 · The Brieftide
  11. OpenClaw: Hugging Face releases open-source LLM fork v1.0 · The Brieftide
  12. Hugging Face and NVIDIA: build domain embeddings in a day · The Brieftide
  13. Ulysses Sequence Parallelism: Million-Token Training · The Brieftide
  14. DeepMind Nano Banana 2 release: image model at Flash speed · The Brieftide
  15. Mixture of Experts (MoE) in Transformers: Hugging Face guide · The Brieftide
  16. Information-driven imaging: Berkeley releases noisy-data estimator · The Brieftide
  17. AlphaFold DeepMind: engineering heat-tolerant Rubisco for crops · The Brieftide
  18. DeepSeek V3 to V3.2: architecture, sparse attention, RL updates · The Brieftide
  19. Nano Banana Pro: DeepMind's Gemini 3 Pro Image model launch · The Brieftide
  20. DeepSeek V3: 671B model, MLA and MoE architectural choices, 2025 · The Brieftide
  21. PEVA whole-body egocentric video prediction with 16s rollouts · The Brieftide
  22. SecAlign and StruQ: Berkeley AI defenses cut prompt-injection · The Brieftide
