Microsoft Introduces Compact AI Models with Multimodal Capabilities
The new Phi-4 models deliver high performance in text, speech, and vision tasks while requiring less computational power.
- Microsoft's Phi-4-Multimodal and Phi-4-Mini models are designed to process text, images, and speech efficiently, with parameter sizes of 5.6 billion and 3.8 billion, respectively.
- Phi-4-Multimodal integrates speech, vision, and text inputs using a novel 'mixture of LoRAs' technique, which attaches modality-specific low-rank adapters to a shared base model so that adding new modalities does not degrade accuracy on existing ones.
- Phi-4-Mini excels in text-based tasks, outperforming larger models on math, coding, and reasoning benchmarks, including scoring 88.6% on the GSM-8K math benchmark.
- Both models are optimized for deployment on standard hardware and edge devices, offering cost savings, reduced latency, and enhanced data privacy.
- The Phi-4 models are available through Azure AI Foundry, Hugging Face, and Nvidia API Catalog, enabling developers to create innovative applications across industries.
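To make the 'mixture of LoRAs' idea concrete, here is a minimal toy sketch in NumPy. This is not Microsoft's implementation; the dimensions, adapter scales, and the `forward` routing function are illustrative assumptions. It shows the core mechanism: a frozen base weight shared by all modalities, plus a small low-rank adapter (`A @ B`) selected per input modality.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 8, 2  # toy hidden size and LoRA rank (real models use far larger d)

# Frozen base projection weight, shared across all modalities.
W_base = rng.normal(size=(d, d))

# One low-rank adapter pair (A, B) per modality; only these would be trained.
adapters = {
    m: (rng.normal(size=(d, r)) * 0.01, rng.normal(size=(r, d)) * 0.01)
    for m in ("text", "vision", "speech")
}

def forward(x, modality):
    """Apply the frozen base weight plus the adapter routed for this modality."""
    A, B = adapters[modality]
    return x @ (W_base + A @ B)

x = rng.normal(size=(1, d))
out_text = forward(x, "text")
out_base = x @ W_base
```

Because each modality only contributes a low-rank update on top of the frozen base, training a new adapter leaves the other modalities' paths untouched, which is the intuition behind the "no performance degradation" claim above.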