Particle.news
Download on the App Store

AWS Blames DynamoDB DNS Race Condition for Global Outage, Disables Automation

AWS responds by disabling the automation worldwide, adding safeguards, expanding tests.

Overview

  • Amazon’s post-mortem attributes the outage to a latent race condition in DynamoDB’s DNS management that generated an empty us-east-1 endpoint record and blocked connections.
  • The fault cascaded through dependent services and network load balancers, defeating automated recovery and requiring manual operator intervention to restore service.
  • AWS has turned off the DynamoDB DNS planner and enactor globally while adding protective checks, improving throttling, building new test suites, and updating EC2 and load balancer behavior.
  • Downdetector and Ookla reported over 16 million user problem reports and more than 2,000 affected services across 60+ countries, with disruptions hitting apps such as Snapchat, Reddit, Roblox, Venmo and banking sites.
  • Most services recovered within hours, and the incident renewed calls from experts and some policymakers for multi‑region designs, greater diversification, and closer regulatory scrutiny of major cloud providers.