Ethics in AI

Nature Study Finds ‘Subliminal Learning’ Lets LLMs Inherit Hidden Traits From Synthetic Data

The result spotlights fresh safety risks for distillation, the common practice of training smaller models on outputs from larger ones.

DeepMind Expands Frontier Safety Framework to Target Shutdown Resistance and Harmful Manipulation

Anthropic Unveils Automated Persona Vectors to Steer and Shield Language Models