Particle.news

Cloudflare Attributes Global Outage to Database Permission Bug, Details Fixes

The failure renewed concerns over reliance on a handful of hyperscale cloud providers.

Overview

  • Cloudflare’s post-incident account ties the Nov. 18 disruption to a ClickHouse permission change that altered metadata results, inflating a Bot Management “feature file” and crashing the core proxy with widespread 5xx errors.
  • Service degradation began around 11:20 UTC. Traffic largely normalized by about 14:30 UTC, following a rollback to a stable build at 14:24 UTC, and full restoration was completed later that afternoon.
  • The outage disrupted high-traffic sites and AI services, including ChatGPT, Perplexity, Shopify, X and Canva. It also affected Cloudflare’s own services, such as dashboard logins that depend on Turnstile, Workers KV and Cloudflare Access.
  • Engineers stopped generation of the bad configuration, pushed a known‑good file, restarted the core proxy and rolled back software. Cloudflare says it will add stricter config validation, global kill switches, resource safeguards and broader failure‑mode reviews.
  • Following recent AWS and Azure incidents, industry voices highlighted systemic concentration risk and urged multi‑vendor or hybrid designs to improve resilience despite added cost and complexity.
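
To make the planned safeguards more concrete, here is a minimal Python sketch of what "stricter config validation" and a "kill switch" could look like for a generated feature file. It is an illustration under stated assumptions, not Cloudflare's implementation: the names `FeatureStore`, `validate_features` and `ConfigRejected`, and the limit `MAX_FEATURES`, are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cap on the number of features, chosen for illustration only.
MAX_FEATURES = 200


class ConfigRejected(Exception):
    """Raised when a candidate feature file fails validation."""


def validate_features(names, max_features=MAX_FEATURES):
    """Return a copy of the feature list if it is safe to load, else raise."""
    if len(names) > max_features:
        raise ConfigRejected(
            f"{len(names)} features exceeds limit of {max_features}"
        )
    if len(set(names)) != len(names):
        # Duplicate rows are treated as a sign of a bad upstream query.
        raise ConfigRejected("duplicate feature names")
    return list(names)


@dataclass
class FeatureStore:
    """Holds the active feature list and refuses unsafe replacements."""

    active: list
    kill_switch: bool = False  # when True, every new file is refused

    def try_update(self, candidate):
        """Swap in the candidate only if valid; otherwise keep the old list."""
        if self.kill_switch:
            return False
        try:
            self.active = validate_features(candidate)
        except ConfigRejected:
            return False
        return True


store = FeatureStore(active=["ua_score", "ip_reputation"])
store.try_update(["ua_score", "ua_score"])  # rejected, old list stays active
```

The design choice in this sketch is that a rejected file never replaces the last known-good one, so an oversized or duplicated file degrades freshness rather than crashing the consumer.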