Overview
- The Anti-Defamation League on Jan. 28 released its first AI Index, based on more than 25,000 chats that tested six widely used models on anti-Jewish, anti-Zionist, and extremist content from August to October 2025.
- Anthropic’s Claude led with an overall score of 80, followed by ChatGPT (57), DeepSeek (50), Gemini (49), Llama (31), and xAI’s Grok last at 21.
- ADL leaders said every model showed gaps, with particular weaknesses on anti-Zionist and extremist prompts, and they urged companies to strengthen detection and counter‑narrative responses.
- Grok performed consistently poorly, scoring zero on several document‑summary tasks and failing almost completely at image analysis; the report called for fundamental improvements.
- Regulatory pressure intensified this week as the European Union opened a probe into Grok’s image features tied to sexual deepfakes, and 35 U.S. state attorneys general, led by Pennsylvania’s Dave Sunday, pressed xAI to disable its image-undressing tool.