Particle.news
Download on the App Store

AWS Traces Global Outage to DynamoDB DNS Race Condition, Details Fixes in Post‑Mortem

Amazon disabled the faulty automation worldwide, pledging new safeguards including expanded testing.

Overview

  • AWS said a latent race condition in DynamoDB’s automated DNS management created an empty us‑east‑1 regional record, breaking connections to the service.
  • The automation failed to self‑repair and engineers performed manual intervention, after which AWS shut off the DynamoDB DNS planner and enactor globally.
  • Amazon outlined mitigations including new protective checks, throttling improvements, and an additional test suite before re‑enabling the automation.
  • Downdetector reported over 16 million user outage reports affecting more than 2,000 services in 60+ countries, with platforms from Snapchat and Disney+ to Lloyds Bank and Reddit disrupted.
  • Recovery times varied widely as some systems failed over within minutes while others were degraded for hours, fueling calls for greater cloud resilience and scrutiny of reliance on a few providers.