Particle.news

AWS Pins Global Outage on DynamoDB DNS Race Condition, Suspends Automation Worldwide

The post-mortem traces the outage to an empty regional DNS record in US‑East‑1 that set off cascading failures in core services, showing how a fault in one concentrated dependency can ripple across the internet.

Overview

  • Amazon said a latent race condition in DynamoDB’s automated DNS management produced an incorrect empty record for the us‑east‑1 endpoint that automation failed to repair, forcing manual intervention.
  • AWS apologized, disabled the DynamoDB DNS planner and enactor globally, and is adding protective checks, throttling controls, and expanded test suites before re‑enabling the automation.
  • The failure triggered downstream issues across EC2 and Network Load Balancer and caused prolonged replication lag for DynamoDB global tables linked to the Northern Virginia region.
  • Downdetector recorded over 16 million user reports spanning more than 2,000 services across 60+ countries, with major brands such as Snapchat, Roblox, Reddit, Ring, Venmo and multiple banks affected.
  • The disruption renewed calls for multi‑region and multi‑provider designs and stronger safeguards, while some customers introduced product mitigations, including Eight Sleep enabling local Bluetooth control during cloud outages.
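
To make the first two points concrete, here is a minimal sketch of how a last-writer-wins update can leave a record empty, and how simple protective checks can block it. It is a toy model, not AWS's implementation: the `Plan` and `RecordStore` classes, the version numbers, and the IP address are all hypothetical, and the summary above does not describe the actual sequence of events in this much detail.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Hypothetical DNS plan: a version number plus the IPs to publish."""
    version: int
    ips: list[str]


@dataclass
class RecordStore:
    """Toy stand-in for a DNS record; not a real AWS or DNS API."""
    applied_version: int = 0
    ips: list[str] = field(default_factory=list)

    def apply_unguarded(self, plan: Plan) -> bool:
        # Last writer wins, so a delayed worker can overwrite newer data.
        self.applied_version = plan.version
        self.ips = list(plan.ips)
        return True

    def apply_guarded(self, plan: Plan) -> bool:
        # Protective check 1 (assumed): never publish an empty record set.
        if not plan.ips:
            return False
        # Protective check 2 (assumed): never roll back to an older plan.
        if plan.version <= self.applied_version:
            return False
        self.applied_version = plan.version
        self.ips = list(plan.ips)
        return True


def simulate(guarded: bool) -> list[str]:
    """Apply a fresh plan, then a stale empty one that arrives late."""
    store = RecordStore()
    fresh = Plan(version=2, ips=["10.0.0.2"])
    stale = Plan(version=1, ips=[])  # delayed worker holding outdated data
    apply = store.apply_guarded if guarded else store.apply_unguarded
    apply(fresh)
    apply(stale)
    return store.ips


if __name__ == "__main__":
    print("unguarded:", simulate(guarded=False))
    print("guarded:  ", simulate(guarded=True))
```

In the unguarded run the late, stale plan wipes the record; in the guarded run both checks reject it and the fresh addresses stay published. Real systems would also need the check and the write to happen atomically, which this single-threaded sketch does not attempt to show.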