Microsoft releases Phi-4 language model trained mainly on synthetic data

December 13, 2024: Microsoft's Phi-4 Model Excels with Synthetic Training - Microsoft introduced Phi-4, a compact language model focused on math problem-solving and trained primarily on synthetic data. It features an advanced tokenizer and attention mechanism and handles up to 4,000 tokens of context. Phi-4's training included 50 synthetic datasets totaling 400 billion tokens, derived from adapted web content and code snippets.

This training approach enabled Phi-4 to surpass larger models such as GPT-4o and Llama 3.3 on benchmarks including GPQA and MATH, highlighting the role of synthetic data in improving reasoning abilities. Phi-4 is accessible via Azure AI Foundry, with a planned release on Hugging Face.
