Overview
- Amazon said a latent race condition in DynamoDB’s automated DNS management produced an incorrect empty record for the us‑east‑1 endpoint that automation failed to repair, forcing manual intervention.
- AWS apologized, disabled the DynamoDB DNS planner and enactor globally, and is adding protective checks, throttling controls, and expanded test suites before re‑enabling the automation.
- The failure triggered downstream issues across EC2 and Network Load Balancer and caused prolonged replication lag for DynamoDB global tables linked to the Northern Virginia region.
- Downdetector recorded over 16 million user reports spanning more than 2,000 services across 60+ countries, with major brands such as Snapchat, Roblox, Reddit, Ring, Venmo and multiple banks affected.
- The disruption renewed calls for multi‑region and multi‑provider designs and stronger safeguards, while some customers introduced product mitigations, including Eight Sleep enabling local Bluetooth control during cloud outages.