In artificial intelligence, Measuring Massive Multitask Language Understanding (MMLU) is a benchmark for evaluating the capabilities of large language models.