Overview
- OpenAI leaders forecasted that AI agents would handle real-world work in 2025, yet broad, reliable autonomy failed to arrive by year’s end.
- OpenAI has de-emphasized agent development to refocus on its core chatbot, according to reporting on an internal memo.
- Agents show strong results in text and terminal-based coding tasks, but they falter on point‑and‑click web work, where interactions are slow and brittle.
- A July review of ChatGPT Agent documented minutes‑long stalls and basic browser failures, underscoring current GUI limitations.
- Reliability remains a core blocker: demos and benchmarks have noted factual errors and a hallucination rate of about 10% in GPT‑5 variants. Model Context Protocol and Google’s Agent2Agent are being pursued as longer‑term fixes, but they will also require bot‑friendly infrastructure.