Overview
- TRM (Tiny Recursive Model), a 7-million-parameter model, is reported to beat much larger language models on hard reasoning benchmarks such as Sudoku-Extreme and ARC-AGI.
- Published results cite 87% on Sudoku-Extreme, 44.6–45% on ARC-AGI-1, and 7.8% on ARC-AGI-2, a benchmark on which most large models reportedly score under 5%.
- Coverage says TRM outperforms contenders including Google’s Gemini 2.5 Pro, DeepSeek R1, and OpenAI’s o3-mini on ARC-AGI benchmarks.
- The model uses a two-layer network in a recursive draft–critique–revise loop, repeating up to 16 iterations to self-correct and converge on a solution (see the sketch after this list).
- According to reports, TRM simplifies the earlier HRM (Hierarchical Reasoning Model) approach by training a single network with standard backpropagation; the work has been released with code on GitHub alongside an arXiv paper, and the model is small enough to run on a laptop.
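
The recursive loop described above can be illustrated with a rough PyTorch sketch. The class name, layer sizes, inner/outer step counts, and exact update rule below are assumptions for illustration, not the released TRM code; the point is that a single small, weight-shared network repeatedly critiques a latent state and revises the answer draft, with gradients flowing through the fully unrolled loop via ordinary backpropagation.

```python
import torch
import torch.nn as nn

class TinyRecursiveSketch(nn.Module):
    """One small weight-shared network applied recursively: refine a latent
    'critique' state several times, then revise the answer draft, and repeat."""

    def __init__(self, dim=256, n_inner=6, n_outer=16):
        super().__init__()
        self.n_inner = n_inner  # latent-state updates per outer step (assumed value)
        self.n_outer = n_outer  # up to 16 outer draft-revise iterations
        # A single two-layer MLP reused at every step; weight sharing keeps
        # the parameter count tiny.
        self.core = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )
        self.revise = nn.Linear(2 * dim, dim)

    def forward(self, x):
        y = torch.zeros_like(x)  # current answer draft
        z = torch.zeros_like(x)  # latent reasoning / critique state
        for _ in range(self.n_outer):
            for _ in range(self.n_inner):
                # critique: update the latent state from the question, the draft, and itself
                z = self.core(torch.cat([x, y, z], dim=-1))
            # revise: update the answer draft from the latent state; gradients
            # flow through the whole unrolled loop (plain backpropagation)
            y = self.revise(torch.cat([y, z], dim=-1))
        return y

model = TinyRecursiveSketch()
print(sum(p.numel() for p in model.parameters()), "parameters")  # well under a million in this toy
out = model(torch.randn(4, 256))  # batch of 4 embedded puzzles -> refined answer embeddings
```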