Microsoft has officially unveiled Phi-3 Mini, the latest and smallest addition to its lineup of AI models. The first of three planned releases, Phi-3 Mini is a compact model with 3.8 billion parameters, trained on a smaller, more curated dataset than the expansive corpora behind larger models such as GPT-4. It is available now on Azure, Hugging Face, and Ollama, and is just the beginning: the upcoming Phi-3 Small and Phi-3 Medium will have 7 billion and 14 billion parameters, respectively.
In this context, a model's parameter count is a rough measure of its capacity: the number of learned weights it can draw on to process and respond to complex instructions. Following December's release of Phi-2, which matched the performance of considerably larger models, Microsoft asserts that Phi-3 Mini surpasses its predecessor in both efficiency and capability, delivering responses with the sophistication expected of models ten times its size.
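These parameter counts translate directly into hardware requirements, which is why smaller models are cheaper to run. As a rough back-of-the-envelope illustration (assuming 16-bit weights and ignoring activation and cache memory, which real deployments also need):

```python
# Approximate memory needed just to hold model weights at 16-bit precision.
# Illustration only: actual usage varies with quantization and runtime overhead.
PARAM_COUNTS = {
    "Phi-3 Mini": 3.8e9,
    "Phi-3 Small": 7e9,
    "Phi-3 Medium": 14e9,
}

BYTES_PER_PARAM = 2  # fp16/bf16 stores each weight in 2 bytes


def weight_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Approximate gigabytes required to store the weights alone."""
    return num_params * bytes_per_param / 1e9


for name, n in PARAM_COUNTS.items():
    print(f"{name}: ~{weight_memory_gb(n):.1f} GB")
# Phi-3 Mini: ~7.6 GB
# Phi-3 Small: ~14.0 GB
# Phi-3 Medium: ~28.0 GB
```

By this estimate, Phi-3 Mini's weights fit in under 8 GB, within reach of a single consumer GPU, whereas frontier models with hundreds of billions of parameters require multi-GPU server hardware.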
Tailored learning through innovative methods
Eric Boyd, the Corporate Vice President of Microsoft Azure AI Platform, explained to The Verge how the Phi-3 Mini manages to achieve such high performance. “It’s comparable to larger LLMs like GPT-3.5, just in a smaller form factor,” said Boyd. The development team employed a novel training approach they call a “curriculum,” inspired by the learning progression seen in children. This method involved using simplified text structures and vocabulary, akin to children’s literature, to effectively train the Phi-3 Mini on complex topics.
To supplement the limited supply of suitable children's books, the team started from a list of more than 3,000 topics and used a larger language model to generate simplified "children's books" of its own. This approach not only facilitated the training of Phi-3 but also strengthened its coding and reasoning abilities, building on the foundations laid by its predecessors.
The Phi-3 series, while knowledgeable on general topics, does not rival the breadth of a full-scale model like GPT-4. However, Boyd argues that for many companies, smaller, more focused models like Phi-3 Mini are the better fit for their specific applications: they require far less computational power, making them significantly more cost-effective, particularly for businesses working with smaller internal datasets.
Microsoft's Phi-3 Mini represents a significant step toward AI models tailored to specific tasks and industries. By pairing strong capabilities with cost efficiency, Microsoft is paving the way for more accessible and versatile AI solutions.