Meta Unveils Llama 3.2: Advanced Multimodal AI Models Now Available
The latest Llama 3.2 models from Meta offer enhanced text and vision capabilities, optimized for edge, mobile, and cloud applications.
- Meta's Llama 3.2 release includes lightweight text models (1B and 3B) and vision models (11B and 90B), designed for diverse applications from summarization to image reasoning.
- The models are available on multiple platforms including AWS, Google Cloud, and Hugging Face, providing broad accessibility for developers.
- Llama 3.2's vision models can perform complex tasks such as document-level understanding and image captioning, and Meta reports they are competitive with closed models like Claude 3 Haiku on image-understanding benchmarks (a minimal captioning sketch follows this list).
- The lightweight 1B and 3B text models are optimized to run efficiently on edge and mobile devices, handling tasks like summarization and instruction following with a 128K-token context window (see the text-generation sketch below).
- Meta's models are integrated with tools like UnslothAI for faster finetuning and inference, and ship with new safety tools such as Llama Guard 3 11B Vision for content-safety classification (a finetuning sketch closes this section).
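
Because the checkpoints are hosted on Hugging Face, the vision models can be loaded through the `transformers` library (version 4.45 or later, which adds the Mllama classes). The sketch below shows one way to caption an image with the 11B instruct variant; the image URL is a placeholder, and the generation settings are illustrative rather than recommended defaults.

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

# bf16 plus device_map="auto" spreads the model across available GPUs.
model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder URL; substitute any publicly reachable image.
image = Image.open(requests.get("https://example.com/photo.jpg", stream=True).raw)

# The vision models take interleaved image/text chat messages.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Write a one-sentence caption for this image."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, add_special_tokens=False,
                   return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=60)
print(processor.decode(output[0], skip_special_tokens=True))
```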
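
The lightweight text models follow the same pattern but fit comfortably in a single small GPU or an accelerated laptop. A minimal text-generation sketch, assuming the 3B instruct checkpoint (the 1B variant loads identically):

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-3B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat-style input: the pipeline applies the Llama 3.2 chat template itself.
messages = [
    {"role": "system", "content": "Summarize the user's text in two sentences."},
    {"role": "user", "content": "<paste a long document here>"},
]
out = pipe(messages, max_new_tokens=128)

# The last message in generated_text is the assistant's reply.
print(out[0]["generated_text"][-1]["content"])
```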
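
For finetuning, Unsloth exposes the Llama 3.2 checkpoints through its `FastLanguageModel` wrapper. The sketch below shows the typical setup under assumed defaults (4-bit loading, LoRA rank 16); the resulting model would then be passed to a trainer such as `trl`'s `SFTTrainer` along with a dataset.

```python
from unsloth import FastLanguageModel

# Load a 4-bit quantized Llama 3.2 for memory-efficient finetuning.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)
```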