Humanity's Last Exam is a language model benchmark encompassing 3000 unambiguous and easily verifiable academic questions about mathematics, humanities, and the natural sciences contributed by almost 1000 subject-experts from over 500 institutions across 50 countries, providing expert-level human performance on closed-ended academic questions. From Wikipedia