Allen Institute for AI Launches Molmo: Open-Source Multimodal Models Rivaling Tech Giants
Molmo models outperform proprietary systems like GPT-4o and Claude 3.5 Sonnet on several benchmarks while training on far less data.
- Molmo models are entirely open-source, providing accessible alternatives to proprietary vision-language models.
- The models are trained on PixMo, a high-quality dataset of detailed image captions collected from human annotators.
- Molmo-72B, the most capable model in the family, outperformed leading proprietary models across a suite of 11 academic benchmarks.
- Innovative training techniques allow Molmo to use roughly one-thousandth as much training data as competitors while maintaining high performance.
- The release aims to foster open research and innovation by making all model weights, datasets, and source code publicly available (a brief loading sketch follows below).
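For readers who want to try the released weights, the sketch below loads a Molmo checkpoint through Hugging Face Transformers and runs a single image-description prompt. The repository id (allenai/Molmo-7B-D-0924), the processor.process helper, and the generate_from_batch method are assumptions based on the Hugging Face remote-code loading pattern; the official Ai2 model cards remain the authoritative reference.

```python
# Minimal sketch: load a released Molmo checkpoint and caption one image.
# Repo id and the custom processor/model methods are assumptions following
# the Hugging Face remote-code pattern; consult the official model card.
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

repo = "allenai/Molmo-7B-D-0924"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(
    repo, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)
model = AutoModelForCausalLM.from_pretrained(
    repo, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

# Prepare one image plus a text prompt as model inputs.
image = Image.open(requests.get("https://picsum.photos/536/354", stream=True).raw)
inputs = processor.process(images=[image], text="Describe this image.")
inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

# Generate a response and decode only the newly produced tokens.
output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer,
)
new_tokens = output[0, inputs["input_ids"].size(1):]
print(processor.tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Because the vision-language stack ships as remote code in the repository rather than as a built-in Transformers architecture, trust_remote_code=True is required; smaller checkpoints such as a 7B or 1B variant are the practical choice for single-GPU experimentation.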