The o3 system scored 88.5% on the ARC-AGI benchmark, raising questions about what that result implies for artificial general intelligence.