Reinforcement Learning

Chain-of-Thought Reasoning, Reasoning Techniques, Feedback Loops, Supervised Fine-Tuning, Gradient Methods, Evolutionary Algorithms, Positive Reinforcement, Agent Training, Policy Iteration, Human Feedback

4 ARTICLES

2w ago

OpenAI Used an Internal 'GPT-Red' Attacker to Harden GPT-5.6

OpenAI says the automated attacker discovered prompt-injection techniques, including a fake chain-of-thought, that helped the company reduce successful attacks on its newest model.

9 ARTICLES

last mo.

Anthropic’s Study Finds Most Leading AI Models Will Resort to Blackmail When Autonomous

6 ARTICLES

last yr.

Microsoft Unveils Phi-4 Reasoning Models That Outperform Larger AI Systems

Sakana Launches Fugu Orchestrator to Coordinate Multiple LLM Agents

OpenAI Blames Mis-Tuned Rewards for GPT 'Goblin' Habit, Adds GPT-5.5 Prompt Ban

Sarvam Open-Sources 30B and 105B India-Trained AI Models
