Tomasz Tunguz Blog · 2026-04-05 · 60d

AI Model Compression: From Frontier to Mobile in 23 Months

Frontier-level AI capability that required a 1.8-trillion-parameter model roughly two years ago now runs on smartphones in a model of only 4 billion parameters, a 450x reduction in parameter count achieved through better algorithms, talent concentration, and massive capital investment. Google's Gemma 4 E4B now matches GPT-4o performance on mobile devices, with further releases expected from DeepSeek, Qwen, Kimi, and Minimax in the coming weeks. At the current rate of compression, frontier-level AI capabilities will run on consumer phones before the phones themselves need upgrading.


Metrics in this report

Compressed Model Size

4 billion parameters

current

mobile-runnable version today

Model Parameter Compression Ratio

450x multiplier

over 23 months

frontier model compression from server to mobile

Newsletter Readership

150,000+ subscribers

minimum

Tomasz Tunguz newsletter subscribers

Original Model Size

1.8 trillion parameters

baseline

frontier model 23 months ago

Time to Laptop Compatibility

3-4 months

median

from frontier model release to laptop execution

Time to Mobile Compatibility

23 months

median

from frontier model release to smartphone execution
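
The 450x figure follows directly from the two parameter counts in the metrics above. The short sketch below reproduces that arithmetic and, as an illustration only, converts it into an implied halving time. The constant exponential rate is an assumption made for this sketch and is not stated in the source; the variable names are likewise illustrative.

```python
import math

# Figures quoted in the metrics above.
original_params = 1_800_000_000_000   # 1.8 trillion parameters (baseline)
compressed_params = 4_000_000_000     # 4 billion parameters (current)
elapsed_months = 23

# Compression ratio: baseline size divided by current size.
compression_ratio = original_params / compressed_params

# Assumption (not from the source): the shrinkage happened at a
# constant exponential rate over the whole 23-month period.
monthly_factor = compression_ratio ** (1 / elapsed_months)
halving_months = elapsed_months * math.log(2) / math.log(compression_ratio)

print(f"compression ratio: {compression_ratio:.0f}x")
print(f"implied monthly shrink factor: {monthly_factor:.2f}x")
print(f"implied halving time: {halving_months:.1f} months")
```

Under that assumption, the parameter count needed for a given capability level would halve roughly every 2.6 months. The real trajectory is unlikely to be this smooth, so the halving time is best read as a rough average, not a forecast.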