
DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch


December 26, 2024: DeepSeek-V3 Outshines Llama and Qwen Models - Chinese AI startup DeepSeek has unveiled DeepSeek-V3, a 671B-parameter open-source model that outperforms Llama-3.1 and Qwen on launch benchmarks. The model uses a mixture-of-experts architecture that activates only a subset of its parameters (about 37B) for each token, which keeps compute per token well below that of a dense model of the same size. Two key innovations, auxiliary-loss-free load balancing and multi-token prediction, boost performance further, and the model posts top scores on Chinese-language and math-centric benchmarks.
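
To make the mixture-of-experts idea concrete, here is a minimal toy sketch of top-k expert routing in Python. It is not DeepSeek's implementation: the expert functions, router scores, and names such as `top_k_route` and `moe_forward` are hypothetical and chosen only for illustration. It shows how a router can pick a few experts per input so that most of the model's parameters stay idle on any single token.

```python
import math

def top_k_route(scores, k):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    exps = [math.exp(scores[i]) for i in top]
    total = sum(exps)
    return top, [e / total for e in exps]

def moe_forward(x, experts, router_scores, k):
    """Run only the selected experts and mix their outputs by routing weight."""
    chosen, weights = top_k_route(router_scores, k)
    output = sum(w * experts[i](x) for i, w in zip(chosen, weights))
    return output, chosen, weights

# Toy setup (assumption): 8 hypothetical experts, each a simple scaling function.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores for one token; a real router computes these from the input.
scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

output, chosen, weights = moe_forward(2.0, experts, scores, k=2)
active_fraction = len(chosen) / len(experts)
print(chosen, active_fraction)
```

In this toy run only 2 of the 8 experts execute, so a quarter of the expert parameters are touched for that token. A production router also has to keep the load spread evenly across experts, which is the problem the load-balancing technique mentioned above addresses.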

Trained at a reported cost of about $5.57 million, DeepSeek-V3 challenges closed-source models and signals that the gap between open and closed-source AI is narrowing. The model is available on GitHub and through an API, giving enterprises a competitive open alternative.
