CrowdStrike BSOD error: Risking future of AI in cybersecurity?

By Jayesh Shinde | Updated on 19-Jul-2024

On an otherwise ordinary Friday, when most office workers were trying to get through the work day with one eye on the weekend, CrowdStrike, one of the giants of the cybersecurity world, faced an unprecedented crisis. In what will be remembered as a day of digital chaos, the AI-powered security platform experienced a massive outage that left users worldwide staring at the dreaded Blue Screen of Death (BSOD) on their Windows machines.

CrowdStrike, known for its cutting-edge AI-driven cybersecurity solutions, saw its services crash due to a problematic update. Reports flooded in from users across the globe – Australia, India, the Czech Republic, and beyond – whose Windows work PCs and laptops were hitting sudden and inexplicable BSOD errors. Reddit user TipOFMYTONGUEDAMN was among the first to sound the alarm, noting that CrowdStrike servers were down. The BSOD is a critical error screen that Windows displays when a severe system fault prevents safe operation. This disruption wasn’t limited to a single region, making it a global issue with widespread repercussions.

The root cause of the problem lay in a recent software update to Falcon Sensor, CrowdStrike’s EDR (Endpoint Detection & Response) product, according to Omer Grossman, CIO of CyberArk, an Israeli cybersecurity firm. “A malfunction in this can, as we are seeing in the current CrowdStrike incident, cause the operating system to crash,” said Grossman, explaining why the damage to business processes at the global level was so dramatic. That is exactly what happened when this seemingly routine CrowdStrike update triggered a catastrophic failure for its hundreds of thousands of customers running Microsoft’s Windows machines, resulting in BSOD errors that rendered their workstations useless for anywhere from several minutes to multiple hours.

“These outages are increasing in volume due to the sheer increase in the number of online users and traffic,” suggested Jake Moore, Global Security Advisor at ESET, a global cybersecurity provider. “After witnessing the blue screen of death (BSOD), many people are quick to suspect a cyberattack or find similarities to Netflix’s Leave The World Behind but this can often add to the confusion. It highlights the importance of these services and the millions of people they serve,” he further highlighted.

Even as CrowdStrike acknowledged the issue, users looking for answers flooded social media platforms like X (formerly Twitter) and Reddit to share their mounting sense of helplessness and frustration. Mac and Linux fans, meanwhile, were seen rubbing it in, boasting to Windows users hit by the CrowdStrike BSOD error about their supposedly crash-proof operating systems.

While CrowdStrike rushed to deploy a fix for customer workstations suffering from Windows’ BSOD, CyberArk’s Omer Grossman believes that getting those systems back online and restoring business continuity will take time. “It turns out that because the endpoints have crashed with the BSOD error, they cannot be updated remotely and this problem must be solved manually, endpoint by endpoint. This is expected to be a process that will take days,” Grossman said.

Microsoft said in a statement that it was “aware of an issue affecting Windows devices due to an update from a third-party software platform,” and also that it was working to restore Azure services “as quickly as possible.” It is crazy to think that the old and ignominious Microsoft Windows’ BSOD error – which is usually the result of hardware malfunctions, driver problems, or abrupt termination of essential operating system services – ultimately became a global symbol of the CrowdStrike cybersecurity platform’s epic failure.

AI in cybersecurity: Is it flawed?

CrowdStrike positions itself as an industry leader in AI-powered security solutions. In fact, CrowdStrike claims that its Falcon platform offers the “industry’s most complete AI-native defence, trained on the world’s highest-fidelity security data and augmented by ground truth from CrowdStrike’s elite threat hunters”.

Needless to say, this recent CrowdStrike incident puts a huge dent in that claim and raises critical questions about the potential limitations and risks of AI in cybersecurity. While AI promises enhanced detection and response capabilities, this event underscores the need for robust oversight and validation processes to mitigate unforeseen failures. Can AI truly be relied upon to manage our most sensitive security needs, or are we placing too much trust in an imperfect system?

According to ESET’s Jake Moore, another aspect of this CrowdStrike incident relates to diversity in the use of large-scale IT infrastructure – especially in terms of critical systems like operating systems, cybersecurity products, and other globally deployed applications. “Where diversity is low, a single technical incident, not to mention a security issue, can lead to global-scale outages with subsequent knock-on effects,” he said.

“The inconvenience caused by the loss of access to services for thousands of people serves as a reminder of our dependence on Big Tech such as Microsoft in running our daily lives and businesses. Upgrades and maintenance to systems and networks can unintentionally include small errors, which can have wide-reaching consequences as experienced today by CrowdStrike’s customers,” Moore further emphasised.

Blaming AI alone for this CrowdStrike outage may not be entirely fair, and it takes a superficial view of the problem. While CrowdStrike’s CEO has publicly denied that the outage was caused by a cyberattack, CyberArk’s Grossman suggested the entire cybersecurity world will be waiting for CrowdStrike’s transparency and disclosure on the matter. “The range of possibilities ranges from human error – for instance a developer who downloaded an update without sufficient quality control – to the complex and intriguing scenario. CrowdStrike’s analysis and updates in the coming days will be of the utmost interest,” he summarised.

The CrowdStrike incident shines a spotlight on the perennial challenge of trust in the world of cybersecurity. When a leading security solution provider suffers such a critical failure, it shakes the confidence of its user base. And as we expand the frontiers of AI application across various industries, this incident is a reminder that AI, despite its advancements, is not infallible. It highlights the importance of continuous human oversight and the necessity of rigorous testing before deploying updates that can impact millions of users worldwide. As cybersecurity challenges evolve, companies like CrowdStrike must lead the way in responsible use of AI, ensuring that innovation does not come at the expense of reliability and user trust.

Jayesh Shinde

Executive Editor at Digit. Technology journalist since Jan 2008, with stints at Indiatimes.com and PCWorld.in. Enthusiastic dad, reluctant traveler, weekend gamer, LOTR nerd, pseudo bon vivant.