December 26, 2024:
DeepSeek-V3 Sets New AI Benchmark - Chinese AI startup DeepSeek has launched DeepSeek-V3, a 671B-parameter open-source model built on a mixture-of-experts architecture that activates only about 37B parameters per token. It surpasses leading open models such as Meta's Llama 3.1 and Alibaba's Qwen, and closely matches closed models from Anthropic and OpenAI. Key innovations include an auxiliary-loss-free load-balancing strategy and multi-token prediction, which improve performance and training efficiency; the reported training cost is about $5.57 million.
DeepSeek-V3 performs especially well on Chinese-language and math-centric tasks, strengthening open-source AI's competitiveness against proprietary alternatives. The model is available on Hugging Face and through an MIT-licensed GitHub repository, making it broadly accessible and encouraging collaboration in the AI community.