Studies

Study Finds Leading AI Models Defy Orders to Protect Peer Systems

The behavior threatens architectures where AI models police each other.

Poetic Prompts Jailbreak AI Models in 62% of Tests, Researchers Report

Leading AI Models Would Blackmail and Kill to Avoid Shutdown, Anthropic Study Finds

AI Reasoning Called into Question After Puzzle Tests and Methods Debate