AI Model Compression: From Frontier to Mobile in 23 Months
Frontier AI capabilities that required a 1.8 trillion parameter model about two years ago now run on smartphones in models of only 4 billion parameters. The source attributes this 450x compression to better algorithms, talent concentration, and massive capital investment. Google's Gemma 4 E4B now matches GPT-4o performance on mobile devices, and further releases are expected from DeepSeek, Qwen, Kimi, and Minimax in the coming weeks. At the current rate of compression, today's frontier-level capabilities should run on a consumer phone before that phone is due for an upgrade.
Metrics in this report
4 billion parameters (current): mobile-runnable version today
450x multiplier (over 23 months): frontier model compression from server to mobile
150,000+ subscribers (minimum): Tomasz Tunguz newsletter subscribers
1.8 trillion parameters (baseline): frontier model 23 months ago
3-4 months (median): from frontier model release to laptop execution
23 months (median): from frontier model release to smartphone execution
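
The headline multiplier can be checked with a few lines of arithmetic. The sketch below uses only the parameter counts and the 23-month window reported above. The implied halving time is an illustration rather than a figure from the report: it assumes parameter count is a fair proxy for compression and that the shrinkage was steady and exponential over the whole period.

```python
import math

# Figures taken from the report.
FRONTIER_PARAMS = 1.8e12  # frontier model, 23 months ago
MOBILE_PARAMS = 4e9       # mobile-runnable model today
MONTHS = 23

# Compression ratio between the two parameter counts.
ratio = FRONTIER_PARAMS / MOBILE_PARAMS

# Assumption (not from the report): steady exponential shrinkage,
# so the ratio can be expressed as a number of successive halvings.
halvings = math.log2(ratio)
months_per_halving = MONTHS / halvings

print(f"compression ratio: {ratio:.0f}x")
print(f"implied halvings: {halvings:.2f}")
print(f"implied months per halving: {months_per_halving:.2f}")
```

Under those assumptions, the parameter count needed for a given capability level halved roughly every 2.6 months. A different proxy, such as memory footprint after quantization, would give a different rate.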