Benchmarking

xAI Rolls Out Grok 4.1 Free to All Users, Citing Faster, More Accurate Replies

xAI touts faster, more reliable answers with a threefold drop in hallucinations.

Samsung Unveils TRUEBench to Measure LLM Productivity in Real-World Tasks

Google Unveils Gemini 2.5 Pro Preview and Doubles Pro Subscription Queries