December 26, 2024:
DeepSeek-V3 Dominates Open-Source AI Landscape - Chinese AI startup DeepSeek unveils DeepSeek-V3, an open-source mixture-of-experts model with 671B parameters. It outperforms leading open models such as Meta's Llama-3.1 and closely matches closed models from Anthropic and OpenAI. Notable features include auxiliary-loss-free load balancing and multi-token prediction, which improve training efficiency and speed.
Trained at a reported cost of about $5.57 million, DeepSeek-V3 excels on Chinese-language and math benchmarks, a significant step toward narrowing the gap between open-source and closed-source models.