DeepSeek may have had a hand in breaking another boundary in AI efficiency.
A paper co-authored by DeepSeek founder Liang Wenfeng and researchers from Peking University outlines a model-training technique that overcomes GPU memory constraints while achieving “aggressive parameter expansion.”
To put it simply, this technique can do a whole lot more with a whole lot less. Using an approach the authors call “conditional memory,” the system can efficiently recall basic information rather than wasting memory and compute on “trivial operations.” In a test on a 27-billion-parameter model, the researchers found that the method surpassed several industry benchmarks while leaving capacity free for complex reasoning tasks.
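Setting the paper's specifics aside, the general idea of trading computation for lookup can be sketched in a few lines. The toy below is purely illustrative: it assumes a per-token gate that blends a cheap embedding lookup (a stand-in for the “memory” path) with an expensive feed-forward pass (the “compute” path). The class name `ConditionalMemoryLayer`, the gating rule, and all dimensions are assumptions made for this sketch, not DeepSeek's actual architecture.

```python
# Illustrative sketch only: a toy "conditional memory" layer. A learned gate
# decides, per token, how much to rely on a precomputed lookup (cheap) versus
# a full feed-forward computation (expensive). Names and gating rule are
# assumptions for illustration, not the method described in the paper.
import torch
import torch.nn as nn


class ConditionalMemoryLayer(nn.Module):
    def __init__(self, vocab_size: int, dim: int, hidden: int):
        super().__init__()
        # Static memory: one stored vector per token, recalled instead of computed.
        self.memory = nn.Embedding(vocab_size, dim)
        # Heavy path: a standard feed-forward block.
        self.ffn = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )
        # Gate: predicts, from the hidden state, whether lookup suffices.
        self.gate = nn.Linear(dim, 1)

    def forward(self, token_ids: torch.Tensor, hidden_states: torch.Tensor) -> torch.Tensor:
        recalled = self.memory(token_ids)            # cheap lookup path
        computed = self.ffn(hidden_states)           # expensive compute path
        g = torch.sigmoid(self.gate(hidden_states))  # soft routing weight in [0, 1]
        # Blend: "trivial" tokens lean on memory, harder ones on computation.
        return g * recalled + (1.0 - g) * computed


# Tiny usage example
layer = ConditionalMemoryLayer(vocab_size=100, dim=16, hidden=64)
ids = torch.randint(0, 100, (2, 5))
h = torch.randn(2, 5, 16)
print(layer(ids, h).shape)  # torch.Size([2, 5, 16])
```

In a production system, the payoff of such a gate would come from letting trivial tokens skip the heavy path entirely, saving memory and compute; the soft blend above is just the simplest runnable stand-in for that routing decision.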
The breakthrough comes as DeepSeek prepares to release V4, its next-generation model, which reportedly surpasses competitors in coding capabilities. But DeepSeek and other Chinese AI firms face several challenges:
- Export restrictions on advanced AI chips have strangled resources in China, forcing these firms to rely on less powerful homegrown hardware.
- And the AI industry broadly faces a massive shortage of high-bandwidth memory as prices for this chip component surge.
These challenges could squeeze the market so tightly that some Chinese AI leaders are betting the US will remain in the lead on frontier model innovation for the foreseeable future. Justin Lin, head of Alibaba’s Qwen models, said at a summit in Beijing on Saturday that current compute resources are “stretched thin — just meeting delivery demands consumes most of our resources.”
Still, China leads the field in open-weight AI development: Analysis from Stanford’s Institute for Human-Centered Artificial Intelligence found that downloads of Alibaba’s Qwen models eclipsed Meta’s Llama models on Hugging Face in 2025. And the ecosystem extends far beyond DeepSeek and Alibaba, with these open-weight models performing “at near-state-of-the-art levels across major benchmarks and leaderboards.”