
Topic hub
Model Compression
Covers methods to shrink and speed up AI models, including pruning, quantization, distillation, MoE compression, and training-inference alignment.
74 briefs
Model Compression · Page 2
- Large Language Models: Small Initialization Improves Reasoning · The Brieftide

- Amazon SageMaker AI container caching: up to 2x faster scaling · The Brieftide

- AI Engram: geometric memory traces in deep networks, ICML oral · The Brieftide

- BioNeMo Recipes: LoRA fine-tunes ESM2-3B and Evo2-1B on RTX 6000 · The Brieftide

- Count Anything: Tsinghua model, CLOC dataset and benchmarks · The Brieftide

- Satya Nadella warns on token-maxing and maps developer future · The Brieftide

- Microsoft SkillOpt boosts GPT-5.5 by about 23 points on tasks · The Brieftide

- DiffusionGemma: 4x faster text generation, 26B MoE · The Brieftide

- Cohere North Mini Code: 30B Mixture-of-Experts launch · The Brieftide

- Gemma 4 12B: unified, encoder-free multimodal model for laptops · The Brieftide

- Microsoft Research Lens: detailed captions beat raw scale · The Brieftide

- Anthropic and Stanford: why larger LLMs learn rare skills · The Brieftide

- Sakana AI launches RSI Lab, outlines four-phase roadmap · The Brieftide

- DPO for OCR: cuts text degeneration by 59.4% on DharmaOCR · The Brieftide

- Mellum2 by JetBrains: 12B MoE model with 2.5B active params · The Brieftide

- Profiling in PyTorch: Beginner's Guide to torch.profiler · The Brieftide

- Nemotron-Labs Diffusion: 6.4× self-speculation speed on 8B models · The Brieftide

- DharmaOCR benchmark: 3B specialized model beats frontier APIs · The Brieftide

- OlmoEarth v1.1 release: Up to 3× cheaper satellite AI · The Brieftide

- Gemma 4 and new LLM designs: KV sharing, PLE, compressed attention · The Brieftide

- GridSFM release: Microsoft's model solves AC‑OPF in milliseconds · The Brieftide

- Serverless GPUs add Opus 4.7 Fast and Qwen Image 2.0 support · The Brieftide

- OpenAI Codex with GPT-5.5 used to ship production systems · The Brieftide

- Parameter Golf challenge draws 1,000+ participants, 2,000+ · The Brieftide
