Cybersecurity giant CrowdStrike said a recent software update caused a major global technical outage, affecting around 8.5 million Microsoft devices around the world.
While the incident affected less than 1% of all Windows computers in use, it had significant effects on several key sectors and demonstrates just how interconnected modern digital infrastructure is.
In a blog post, Microsoft clarified the scale of the issue: “We currently estimate that CrowdStrike’s update has affected 8.5 million Windows devices, or less than 1% of all Windows machines.” While this figure is a tiny fraction of the total number of Windows devices, the disruption was widespread, reflecting how many critical services run CrowdStrike’s software.
Impact across multiple industries
The outage has affected multiple industries.
1. Aviation: Thousands of flights were canceled, leaving passengers stranded or experiencing significant delays. Delta Air Lines, one of the worst-affected airlines, reported more than 600 flights were canceled by Saturday morning, with more cancellations expected.
2. Broadcasting: Several broadcasting stations were forced to go off the air and media services were disrupted.
3. Healthcare and banking: Customers were unable to access essential services such as healthcare and the banking system.
4. Government and corporate sector: More than half of the Fortune 500 companies and key government agencies such as the U.S. Cybersecurity and Infrastructure Security Agency relied on CrowdStrike’s software, so the impact of the outage rippled across both the public and private sectors.
Technical details of the incident
The outage was traced to an update CrowdStrike released for its widely used Falcon sensor software. The update was intended to strengthen protection against new threats. However, the update file contained a bug, and many customers’ machines running Microsoft Windows crashed.
Security experts, including Steve Cobb, CSO of Security Scorecard, say the file must have found a way to get past any vetting or sandboxing process used for testing.
The problem lies with “files that contain either configuration information or signatures,” which are used to recognize specific types of malicious code or malware, said Patrick Wardle, a security researcher who specializes in operating system threats.
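To illustrate the kind of vetting the experts describe, here is a minimal sketch of a pre-release check for a signature file. It is not CrowdStrike’s process or file format, neither of which is described here. The JSON layout, the field names (`version`, `signatures`, `id`, `pattern`) and the function `validate_content_file` are all assumptions made up for the example.

```python
import json
import re
from typing import List


def validate_content_file(raw: str) -> List[str]:
    """Return a list of problems found; an empty list means the file passed.

    Assumes a hypothetical JSON layout:
    {"version": 1, "signatures": [{"id": "...", "pattern": "<regex>"}]}
    """
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]

    if not isinstance(doc, dict):
        return ["top level must be a JSON object"]

    errors: List[str] = []
    if not isinstance(doc.get("version"), int):
        errors.append("missing or non-integer 'version'")

    signatures = doc.get("signatures")
    if not isinstance(signatures, list) or not signatures:
        errors.append("'signatures' must be a non-empty list")
        return errors

    for index, sig in enumerate(signatures):
        if not isinstance(sig, dict):
            errors.append(f"signature {index}: must be an object")
            continue
        if not isinstance(sig.get("id"), str):
            errors.append(f"signature {index}: missing string 'id'")
        pattern = sig.get("pattern")
        if not isinstance(pattern, str):
            errors.append(f"signature {index}: missing string 'pattern'")
            continue
        try:
            # A pattern that cannot compile would fail on every machine.
            re.compile(pattern)
        except re.error as exc:
            errors.append(f"signature {index}: bad pattern ({exc})")
    return errors
```

A release pipeline built this way would refuse to ship any file for which the function returns a non-empty list, so a malformed file is rejected before it reaches a customer machine.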
Public images of the outage, including the infamous “blue screen of death” – an error message that appears on affected computers – have been widely shared on social media platforms.
CrowdStrike has provided guidance for repairing systems affected by the incident. However, restoring them is a daunting task, because the flawed code must be manually removed from each affected system.
Microsoft is joining the recovery efforts. The software giant is working with CrowdStrike to create a rapid fix for Microsoft’s Azure infrastructure. Additionally, Microsoft is reaching out to other major cloud providers, including Amazon Web Services and Google Cloud Platform, to share what it is observing about the impact across the industry.
Implications and lessons for industry
The incident is a stark reminder of the potential risks associated with widely used cybersecurity software and the need for rigorous testing protocols. John Hammond, principal security researcher at Huntress Labs, emphasized the importance of a more cautious approach to software updates: “Ideally, they should have rolled it out to a limited pool first. That’s a safer approach to avoid major disruptions like this.”
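The “limited pool first” approach Hammond describes can be sketched in a few lines. This is a generic illustration of a staged rollout, not a description of any vendor’s deployment system. The function `staged_rollout`, the `apply_update` callback, the stage fractions and the 1% failure threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class RolloutResult:
    completed: bool                  # True if every stage passed its check
    hosts_updated: int               # hosts that received the update
    halted_at_stage: Optional[int]   # index of the failing stage, if any


def staged_rollout(
    hosts: Sequence[str],
    stage_fractions: Sequence[float],
    apply_update: Callable[[str], bool],
    max_failure_rate: float = 0.01,
) -> RolloutResult:
    """Push an update to growing slices of a fleet, halting on failures.

    stage_fractions are cumulative shares of the fleet, e.g. (0.01, 0.10, 1.0).
    apply_update (hypothetical) returns True if the host is healthy afterwards.
    """
    updated = 0
    for stage, fraction in enumerate(stage_fractions):
        # Cumulative number of hosts that should be updated after this stage.
        target = min(len(hosts), max(1, int(len(hosts) * fraction)))
        batch = hosts[updated:target]
        failures = sum(1 for host in batch if not apply_update(host))
        updated = max(updated, target)
        if batch and failures / len(batch) > max_failure_rate:
            # Stop here: the remaining hosts never receive the bad update.
            return RolloutResult(False, updated, stage)
    return RolloutResult(True, updated, None)
```

With 100 hosts and stages of 1%, 10% and 100%, an update that crashes every machine is stopped after a single host, while a healthy update proceeds through all three stages to the full fleet.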
The outage highlights the delicate balance between frequent security updates and the need for thorough testing. Patrick Wardle said, “It’s very common for security products to update their signatures once a day because they want to continually monitor for new malware and protect their customers from the latest threats.” However, this frequency may have led to insufficient testing in this case.
Historical background and industry trends
This isn’t the first such incident involving a prominent cybersecurity company: in 2010, a buggy McAfee antivirus update took down hundreds of thousands of machines. But the global impact of the CrowdStrike outage shows how large a footprint a single vendor can have across every sector as more and more businesses come to rely on cybersecurity software.
For all affected organizations currently working hard to rebuild their systems, this incident serves as a reminder of how tightly intertwined everything is in the digital ecosystem. At the same time, it underscores the need for rigorous testing policies, for delivering critical updates gradually, and for fail-safe plans that can be put into action if such an incident recurs.
CrowdStrike’s outage also raises questions about whether there is undue concentration of risk in the cybersecurity industry, and whether these outages are further evidence of the need to diversify security solutions within the system.
As digital infrastructure continues to evolve, the incident will undoubtedly serve as a reference point in conversations about best practices in software development, testing, and deployment, especially for critical infrastructure and security systems.
(Photo: Joshua Horne)