Google Open-Sources DiffusionGemma, 4x Faster LLM

Generative AI launch, June 11, 2026 (source: SiliconANGLE)

Google has released DiffusionGemma, an open-source large language model built on a text diffusion architecture. It generates text four times faster than traditional LLMs while consuming less RAM: rather than emitting one token at a time, the model produces 256 tokens simultaneously and exceeds 1,000 tokens per second on an Nvidia H100 GPU. DiffusionGemma is based on Google's Gemma 4 26B model and uses a mixture-of-experts architecture. Of its 26 billion parameters, only 3.8 billion are activated per query, which allows it to run on consumer-grade GPUs such as the GeForce RTX 5090. The model is available on Hugging Face under an open-source license.
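To put the quoted figures in proportion, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers reported above; the function names are illustrative, and two points are assumptions rather than reported facts: that the 4x speedup and the 1,000 tokens-per-second figure refer to the same hardware, and that throughput can be read as whole 256-token blocks per second (the summary does not say how many refinement passes each block takes).

```python
# Figures quoted in the summary above.
TOTAL_PARAMS_B = 26.0    # total parameters, in billions
ACTIVE_PARAMS_B = 3.8    # parameters activated per query, in billions
BLOCK_TOKENS = 256       # tokens produced simultaneously
TOKENS_PER_SEC = 1000.0  # reported lower bound on an Nvidia H100
SPEEDUP = 4.0            # reported speedup over traditional LLMs


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters that are active at once."""
    return active_b / total_b


def blocks_per_second(tokens_per_sec: float, block_tokens: int) -> float:
    """Throughput expressed in whole 256-token blocks (an assumption)."""
    return tokens_per_sec / block_tokens


def implied_baseline_tps(tokens_per_sec: float, speedup: float) -> float:
    """Baseline throughput implied by the speedup, ASSUMING same hardware."""
    return tokens_per_sec / speedup


print(f"Active share of parameters: {active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B):.1%}")
print(f"Blocks per second: {blocks_per_second(TOKENS_PER_SEC, BLOCK_TOKENS):.2f}")
print(f"Implied baseline: {implied_baseline_tps(TOKENS_PER_SEC, SPEEDUP):.0f} tokens/s")
```

Under these assumptions, roughly one seventh of the parameters are active at a time, the H100 figure corresponds to about four 256-token blocks per second, and the implied one-token-at-a-time baseline is on the order of 250 tokens per second.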


