
DeepSeek open-sources new AI model with 671B parameters

December 26, 2024: DeepSeek Unveils 671B Parameter AI Model - DeepSeek has open-sourced DeepSeek-V3, a new AI model with 671 billion parameters built on a mixture-of-experts (MoE) architecture. The design reduces hardware requirements because only a subset of the model's expert sub-networks runs for each token, leaving roughly 37 billion of the 671 billion parameters active at a time. DeepSeek-V3 outperforms other leading LLMs, achieving top scores on coding, math, and text-processing benchmarks.
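The core idea behind MoE can be sketched in a few lines: a small router scores the experts for each token, and only the top-scoring experts actually compute. The sketch below is purely illustrative; the expert count, top-k value, and dimensions are toy numbers, not DeepSeek-V3's real configuration.

```python
# Toy sketch of mixture-of-experts (MoE) routing: only a few expert
# sub-networks run per token, so active compute is a small fraction
# of the total parameter count. All sizes here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # hypothetical number of experts
TOP_K = 2         # experts activated per token
DIM = 16          # hidden size (illustrative)

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.standard_normal((DIM, DIM)) / np.sqrt(DIM)
           for _ in range(NUM_EXPERTS)]
# The router scores every expert for a given token.
router = rng.standard_normal((DIM, NUM_EXPERTS)) / np.sqrt(DIM)

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route token vector x to its top-k experts and mix their outputs."""
    scores = x @ router                       # one score per expert
    top = np.argsort(scores)[-TOP_K:]         # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                  # softmax over selected experts
    # Only the selected experts do any work; the rest stay idle.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(DIM)
out = moe_layer(token)
print(out.shape)  # (16,)
```

Because the other experts stay idle, a model's total parameter count can grow far beyond what any single forward pass has to touch, which is how a 671B-parameter model keeps per-token compute manageable.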

The model incorporates innovations such as multi-head latent attention and multi-token prediction to improve performance. MoE models are notoriously difficult to train, but DeepSeek developed techniques to keep output quality consistent across experts. The model weights are available now on Hugging Face.
