Overview
- Cloudflare research found that undeclared Perplexity crawlers rotated user agents, IP addresses, and ASNs to evade robots.txt blocks across tens of thousands of domains
- Controlled tests on hidden domains confirmed these stealth bots continued to retrieve restricted content, prompting Cloudflare to remove Perplexity from its verified bot program
- Cloudflare has added heuristics to its bot management system to fingerprint and block the evasive crawlers network-wide
- Perplexity denied any unauthorized scraping, calling Cloudflare’s evidence a misattribution and labeling the blog post a publicity stunt
- The dispute highlights industry moves to enforce site directives and develop licensing frameworks as AI systems ingest web content at massive scale
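
To make the robots.txt point above concrete, here is a minimal sketch of why rotating user agents can sidestep a block aimed at a named crawler. It uses Python's standard `urllib.robotparser`. The crawler name `ExampleBot`, the robots.txt contents, and the URL are hypothetical placeholders, not values taken from Cloudflare's findings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A crawler that identifies itself honestly is matched by the named rule.
declared = is_allowed("ExampleBot/1.0", "https://example.com/article")

# The same crawler presenting a generic browser string falls through to "*".
generic = is_allowed(
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "https://example.com/article",
)

print(f"declared agent allowed: {declared}")
print(f"generic agent allowed:  {generic}")
```

Because robots.txt rules are matched against whatever name the client chooses to present, compliance is voluntary. That is presumably why network-level fingerprinting, rather than the directive alone, is used to enforce a block.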