DeepSeek Launches Open-Source V4 AI Model Series
Chinese AI developer DeepSeek has released its open-source V4 large language model series, comprising two models: the flagship V4-Pro with 1.6 trillion parameters and the lighter V4-Flash with 284 billion parameters. Both use a mixture-of-experts architecture and a new hybrid attention mechanism that reduces KV cache memory usage by 90% compared to previous DeepSeek models.
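To see why a 90% KV cache reduction matters, a back-of-the-envelope sizing calculation helps. The configuration below (layer count, head count, head dimension, context length) is entirely hypothetical, since DeepSeek has not published V4's attention details; only the arithmetic is the point.

```python
# Rough KV-cache sizing for a transformer at inference time.
# All model dimensions here are illustrative placeholders, not V4's
# actual (unpublished) configuration.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_param=2):
    """Bytes needed to cache keys and values (the leading 2 covers K and V;
    bytes_per_param=2 assumes fp16/bf16 storage)."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_param

baseline = kv_cache_bytes(layers=61, kv_heads=128, head_dim=128, seq_len=128_000)
reduced = baseline * 0.10  # the reported 90% reduction

print(f"baseline KV cache: {baseline / 2**30:.1f} GiB")
print(f"after 90% cut:     {reduced / 2**30:.1f} GiB")
```

At long context lengths the KV cache, not the weights, often dominates serving memory, which is why attention-side compression of this kind translates directly into cheaper inference.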
V4 introduces mHC, a mechanism that lets information bypass intermediate neural layers during training, which the company says reduces errors and improves output quality. Trained on 27 trillion tokens, V4-Pro outperformed competitors including Claude Opus 4.6 on three benchmarks. Both models are available in preview on Hugging Face.
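The general idea of letting information bypass intermediate layers can be sketched with a toy skip connection. This is a generic illustration of layer-skipping, not DeepSeek's actual mHC design, whose details are not described in this article.

```python
# Toy illustration of skip connections: the input of each block of
# layers is added back to its output, so information (and gradients)
# can bypass the layers in between. This is a generic sketch, NOT
# DeepSeek's mHC implementation.

def layer(x, weight):
    """A stand-in 'layer': scales the activation (a real model would
    apply a full transformer block here)."""
    return x * weight

def forward_with_skips(x, weights, skip_every=2):
    """Run a stack of toy layers, adding a skip connection around each
    group of `skip_every` consecutive layers."""
    for i in range(0, len(weights), skip_every):
        block_input = x
        for w in weights[i:i + skip_every]:
            x = layer(x, w)
        x = x + block_input  # the skip: data bypasses the block
    return x

print(forward_with_skips(1.0, [0.5, 0.5, 0.5, 0.5]))
```

Without the skip, four layers of 0.5-scaling would shrink the signal to 0.0625; with it, the signal survives, which is the intuition behind why such connections stabilize training of very deep networks.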
