Last month, Microsoft introduced a lightweight AI model called Phi-3 Mini to help small businesses grow. Now the tech giant has introduced another AI model, Phi-3 Vision. What’s special about it? This multimodal AI model can read and analyse not only text but also images. Interestingly, Microsoft has developed it for smartphones. Read on to know more about it.
AI tools are everywhere, yet few of them can read and analyse images. Google’s Circle to Search is one that can, helping people look up whatever they spot in a picture. However, Phi-3 Vision goes further than that.
With Phi-3 Vision, you can ask questions about images or charts, and it will provide insightful responses. It is not a tool for generating images like DALL-E or Stable Diffusion; instead, it excels at analysing and understanding images.
On the technical side, Phi-3 Vision is a 4.2 billion parameter model that complements Phi-3-mini, the smallest member of the Phi-3 family at 3.8 billion parameters. Here’s the complete family: Phi-3-mini (3.8 billion parameters), Phi-3-vision (4.2 billion parameters), Phi-3-small (7 billion parameters), and Phi-3-medium (14 billion parameters).
Microsoft has had success with small models in the past too. Its Orca-Math model, for instance, has reportedly surpassed larger competitors in solving maths problems.
For now, Phi-3-vision is only available in preview mode, while the other three Phi-3 variants (mini, small, and medium) can be accessed through Azure’s model library.
AI is growing rapidly, and tech companies are now turning their attention to smaller models as well. This marks a significant change in strategy. Smaller models need less processing power and memory, which makes them well suited to mobile devices and other resource-constrained environments.
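To put the memory point in rough numbers, here is a back-of-the-envelope sketch in Python. It uses the parameter counts quoted in this article and assumes two common storage formats: 16-bit floats (2 bytes per parameter) and 4-bit quantisation (0.5 bytes per parameter). The function and variable names are illustrative only, and the figures cover the model weights alone, not the extra memory a model needs while running, so treat them as estimates rather than official requirements.

```python
# Parameter counts as quoted in the article, in billions.
PHI3_PARAMS_BILLIONS = {
    "Phi-3-mini": 3.8,
    "Phi-3-vision": 4.2,
    "Phi-3-small": 7.0,
    "Phi-3-medium": 14.0,
}

# Assumed storage formats (not from the article): bytes per parameter.
BYTES_PER_PARAM = {"fp16": 2.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 10**9 bytes).

    Billions of parameters times bytes per parameter gives gigabytes directly.
    """
    return params_billions * bytes_per_param


def estimate_all() -> dict:
    """Return a rounded weight-size estimate for each model and format."""
    return {
        name: {
            fmt: round(weight_memory_gb(params, nbytes), 1)
            for fmt, nbytes in BYTES_PER_PARAM.items()
        }
        for name, params in PHI3_PARAMS_BILLIONS.items()
    }


if __name__ == "__main__":
    for name, sizes in estimate_all().items():
        print(f"{name}: ~{sizes['fp16']} GB at fp16, ~{sizes['int4']} GB at 4-bit")
```

Under these assumptions, the weights of Phi-3-vision come to roughly 8.4 GB at 16-bit precision and about 2.1 GB at 4-bit, which gives a sense of why models of this size are being pitched at phones.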