Raspberry Pi Launches AI HAT+ 2 With Hailo‑10H and 8GB RAM for On‑Device GenAI

Independent tests show mixed gains versus the Pi 5 CPU.

Overview

  • The $130 add‑on for Raspberry Pi 5 runs generative models locally using a Hailo‑10H NPU paired with 8GB of on‑board RAM to preserve privacy and reduce latency.
  • Raspberry Pi lists launch models including DeepSeek‑R1‑Distill, Llama 3.2, Qwen2.5‑Coder 1.5B, Qwen2.5‑Instruct 1.5B, and Qwen2 1.5B, with more and larger options planned.
  • The board delivers 40 TOPS (INT4) for LLM/VLM inference and offers computer‑vision performance broadly equivalent to the prior 26‑TOPS AI HAT+, with tight integration into the Pi camera stack.
  • Early benchmarks from Jeff Geerling report that the Pi 5's CPU often outpaces the Hailo‑10H on LLM inference, though the NPU comes closer on Qwen2.5‑Coder 1.5B and stays under a roughly 3W power envelope versus about 10W for the Pi's SoC.
  • Hailo provides a hailo‑ollama backend and a Dataflow Compiler with LoRA fine‑tuning support, though tests flagged outdated examples and errors when running vision and LLM inference workloads concurrently (see the sketch after this list).
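
Since the backend is described as Ollama‑compatible, local inference would presumably be driven the same way as stock Ollama. The sketch below assumes the hailo‑ollama backend exposes Ollama's standard /api/generate endpoint on the default port 11434 and that a model is registered under the name qwen2.5-coder:1.5b; both the port and the model tag are assumptions, not confirmed details of Hailo's tooling.

```python
# Minimal sketch: query a locally served model through an Ollama-compatible
# REST API. Assumes the hailo-ollama backend mirrors Ollama's /api/generate
# endpoint on localhost:11434 and that "qwen2.5-coder:1.5b" is an available
# model tag -- both are assumptions, not confirmed details of Hailo's stack.
import json
import urllib.request


def generate(prompt: str, model: str = "qwen2.5-coder:1.5b") -> str:
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # request a single JSON response instead of a stream
    }).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(generate("Write a Python one-liner that reverses a string."))
```

If the backend follows Ollama's conventions, the same request could also be issued from the command line with `curl http://localhost:11434/api/generate -d '{"model": "qwen2.5-coder:1.5b", "prompt": "hello", "stream": false}'`.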