
Microsoft today officially unveiled Phi-4, the latest iteration in its Phi series of generative AI models. Phi-4 has 14 billion parameters and is positioned as a small yet powerful model that is said to ‘excel’ at specialized tasks, particularly mathematical reasoning.

“We present phi-4, a 14-billion parameter language model developed with a training recipe that is centrally focused on data quality. Unlike most language models, where pre-training is based primarily on organic data sources such as web content or code, phi-4 strategically incorporates synthetic data throughout the training process. While previous models in the Phi family largely distill the capabilities of a teacher model (specifically GPT-4), phi-4 substantially surpasses its teacher model on STEM-focused QA capabilities, giving evidence that our data-generation and post-training techniques go beyond distillation. Despite minimal changes to the phi-3 architecture, phi-4 achieves strong performance relative to its size– especially on reasoning-focused benchmarks– due to improved data, training curriculum, and innovations in the post-training scheme,” the company noted in a technical report.

Currently, the model is available under a limited release, mostly for research purposes, through the company’s Azure AI Foundry platform. Microsoft claims it can outperform much larger models, including Google’s Gemini Pro 1.5 and OpenAI’s GPT-4o, on tasks that require complex reasoning. This is most evident in the model’s ability to solve mathematical problems, a capability Microsoft has heavily emphasized in its rollout of Phi-4.

Larger models like GPT-4 and Gemini Ultra are built with hundreds of billions, or even trillions, of parameters. Phi-4, by contrast, aims to achieve comparable results with far fewer computational resources. Microsoft attributes Phi-4’s strong performance to the use of “high-quality synthetic datasets” alongside human-generated content, while maintaining lower computational costs.

Phi-4 was trained on synthetic datasets specifically crafted to provide diverse, structured problem-solving scenarios. These datasets were supplemented by high-quality human-generated content so that the model encountered a wide range of real-world scenarios during training. Microsoft generated the synthetic data using techniques such as multi-agent prompting and instruction reversal, and refined the model with post-training processes such as rejection sampling and Direct Preference Optimization (DPO).
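To make the DPO step more concrete: at its core, Direct Preference Optimization scores a pair of responses (one preferred, one rejected) by how much the policy model favors the preferred response relative to a frozen reference model. The sketch below is a minimal, illustrative version of the standard DPO loss for a single preference pair; it is not Microsoft's implementation, and the function name, arguments, and `beta` value are assumptions for illustration only.

```python
import math


def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Illustrative DPO loss for one preference pair (not Phi-4's code).

    Each argument is the summed log-probability that the policy model
    (or the frozen reference model) assigns to the chosen or rejected
    response for the same prompt.
    """
    # How much more the policy likes each response than the reference does.
    chosen_ratio = logp_chosen - ref_logp_chosen
    rejected_ratio = logp_rejected - ref_logp_rejected

    # Scaled preference margin; beta controls deviation from the reference.
    margin = beta * (chosen_ratio - rejected_ratio)

    # Negative log-sigmoid of the margin: the loss shrinks as the policy
    # prefers the chosen response more strongly than the reference does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Minimizing this loss over many such pairs pushes the model toward responses humans (or a grader) preferred, without training a separate reward model, which is why DPO is a popular post-training step for models of this class.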

Once Phi-4 is made available to a wider user base, it could prove to be a boon for mid-sized companies and organizations with limited computing resources. By keeping costs significantly lower (as compared to large-scale AI models), Phi-4 can free up resources that can be directed toward other avenues. This could benefit enterprises that have hesitated to adopt AI solutions due to the high resource demands of larger models.