January 23, 2025:
Hugging Face Unveils Compact Vision Language Models - Hugging Face has open-sourced SmolVLM-256M, a 256-million-parameter model that the company describes as the smallest vision language model released to date. Its size lets it run on consumer laptops and potentially in web browsers that support WebGPU. The model handles visual inputs such as documents and videos, and it relies on a new, smaller vision encoder that improves efficiency while, according to Hugging Face, preserving output quality.
Hugging Face also released SmolVLM-500M, a larger 500-million-parameter model that trades some of that compactness for better performance. Both SmolVLM-256M and SmolVLM-500M are available on Hugging Face's platform, showing how far capable vision language models can be shrunk for on-device use.