Google just shipped Quantization-Aware Training (QAT) checkpoints for the entire Gemma 4 family, and this is one of the most practically useful open-weights releases of the year. The goal: run capable AI models on the hardware you already own, without the quality penalty that normally comes with aggressive compression.

The new checkpoints use QAT, which simulates quantization during training to minimize quality loss once the model is compressed. That makes Gemma 4 efficient enough for local inference on everyday edge devices and consumer GPUs. The lineup spans from a sub-1GB phone-ready variant all the way to a 31B dense model that now fits on a laptop.

Why QAT beats the usual approach

To understand why this matters, you need to know the difference between two compression strategies. The standard approach, Post-Training Quantization (PTQ), takes a fully trained model and converts its weights to a lower-precision format after the fact. It is fast and cheap, but the model was never designed for that precision.

The result is smaller and faster, but accuracy often degrades because the model never learned to compensate for the quantization noise. QAT flips this: it simulates low-precision math during the forward pass, so the model learns to compensate for quantization error while it is still training.
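To make "simulating low-precision math during the forward pass" concrete, here is a minimal NumPy sketch of the quantize-then-dequantize round trip often called fake quantization. It is an illustration under stated assumptions, not Google's actual Gemma recipe: the function name `fake_quantize`, the symmetric per-tensor scale, and the random stand-in weights are all hypothetical choices made for this example.

```python
import numpy as np

def fake_quantize(weights, num_bits=4):
    """Round weights onto a signed integer grid, then map them back to float.

    Assumption: symmetric, per-tensor scaling. Real QAT recipes may use
    per-channel or per-block scales instead.
    """
    qmax = 2 ** (num_bits - 1) - 1      # 7 for 4-bit
    qmin = -(2 ** (num_bits - 1))       # -8 for 4-bit
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), qmin, qmax)
    return q * scale, scale

# Hypothetical stand-in for one layer's weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(256, 256)).astype(np.float32)

w_fq, scale = fake_quantize(w, num_bits=4)
error = np.abs(w - w_fq)

print("distinct values after fake quantization:", np.unique(w_fq).size)
print("mean absolute rounding error:", float(error.mean()))
```

In PTQ, this rounding is applied once, after training has finished. In QAT, a training framework applies the same round trip inside every forward pass, so the loss already reflects the rounding error and the optimizer can adjust the weights around it. Because rounding has zero gradient almost everywhere, frameworks commonly treat it as the identity in the backward pass (a straight-through estimator); that step needs an autodiff library and is omitted here.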

The payoff is significant. QAT delivers roughly 40-50% lower memory usage, while quality stays within a few percentage points of the FP16 reference on standard benchmarks such as MMLU and GSM8K. Critically, INT4 QAT is significantly more accurate than INT4 PTQ at similar resource usage.
