A couple of hours ago, hundreds of thousands of websites that rely on Cloudflare’s services became unreachable in a major outage for the company. Cloudflare is one of the world’s largest internet infrastructure companies, known for several vital services. These include protection against DDoS attacks and identity theft, as well as distributed DNS services.
Major sites that run on Cloudflare and were affected include Discord, Netflix, Feedly, Politico, Shopify, League of Legends, GitLab, Patreon, Medium and, ironic as it may sound, even Downdetector.
The Cloudflare backbone connects the company's data centers around the world through private lines. This helps it maintain faster loading times for client websites. The failure of the company's widely used 1.1.1.1 DNS service is what made several websites unreachable.
During the outage, which reportedly lasted 27 minutes, Cloudflare saw a 50% drop in traffic across its network. In its official blog, the company said, “The outage occurred because, while working on an unrelated issue with a segment of the backbone from Newark to Chicago, our network engineering team updated the configuration on a router in Atlanta to alleviate congestion. This configuration contained an error that caused all traffic across our backbone to be sent to Atlanta. This quickly overwhelmed the Atlanta router and caused Cloudflare network locations connected to the backbone to fail.”
To prevent a repeat of this kind of congestion, the company has announced a few changes. It will introduce a limit on its backbone BGP (Border Gateway Protocol) sessions, to be deployed on July 20th. It has also changed the BGP local preference for local server routes, which prevents traffic from other locations from mixing in.
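
For readers curious about how these two safeguards work in principle, here is a minimal Python sketch. It is a toy model, not Cloudflare's actual configuration: the names `Route`, `best_route` and `Session`, the prefix, and all numeric values are hypothetical and chosen only for illustration. It relies on two general facts about BGP: a route with a higher local preference is preferred over one with a lower value, and a session with a prefix limit is shut down when a peer announces more prefixes than allowed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str      # destination network, e.g. "203.0.113.0/24"
    next_hop: str    # where traffic for this prefix would be sent
    local_pref: int  # higher value means more preferred


def best_route(routes):
    """Return the route with the highest local preference.

    Real BGP applies further tie-breakers after this step;
    only local preference is modeled here.
    """
    return max(routes, key=lambda r: r.local_pref)


class Session:
    """Toy BGP session that goes down once a prefix limit is exceeded."""

    def __init__(self, max_prefixes):
        self.max_prefixes = max_prefixes
        self.prefixes = set()
        self.up = True

    def receive(self, prefix):
        """Accept an announced prefix; return False if the session is down."""
        if not self.up:
            return False
        self.prefixes.add(prefix)
        if len(self.prefixes) > self.max_prefixes:
            # Too many prefixes: drop everything learned and close the session.
            self.prefixes.clear()
            self.up = False
            return False
        return True
```

In this model, giving local server routes a higher `local_pref` than routes learned over the backbone keeps a location's traffic local, and a session that suddenly announces far more prefixes than expected is closed rather than allowed to attract traffic from everywhere.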