
19.07.24 CrowdStrike Root Cause Analysis (RCA)
“Anything that can go wrong, will go wrong.” Murphy’s Law

After nearly a decade of learning about, contributing to, and supporting the software development life cycle, including about six years in QA and QC, I would like to share my thoughts, knowledge, and perspective on the 19/07/2024 CrowdStrike cloud outage. This outage affected sectors from financial services to media to air travel, highlighting the critical role of robust IT infrastructure in our economy.
I’m still collecting data on what exactly happened and whether the CrowdStrike update rollout introduced an error that caused computers running Microsoft Windows to crash and fail to restart.
Not all software products should follow the “Build fast, break fast” model.
In the intricate world of software development, especially for self-service, multi-platform applications with myriad features, it is impossible to guarantee that QA, QC, and other quality gates will catch every bug.
In my experience, there is no foolproof method to release software with absolute confidence in achieving zero escape defects. The goal is to minimize the risk of escape defects. Every company, with its unique technology, product, organization, and culture, defines its QA and QC gates differently.
The constant goal is to keep reinventing processes to find and minimize the risk of escape defects, using every available strategy: Quality Mindset, Continuous Improvement, Collaboration and Transparency, Automation and Tooling, Customer Feedback, Data-Driven Decision Making, Resilience and Reliability, Scalability and Performance, Security and Compliance.
A small note: if you hear people confidently claiming they know how to achieve 100% confidence in preventing escape defects, they probably don’t know what they are talking about.
It is essential to invest consistently in quality and to focus on it at every stage of the software development life cycle. This is the key to reducing the risk of incidents like the 19/07/2024 CrowdStrike cloud outage.
Invest in QA and QC gates. Shifting left (testing earlier in development) and shifting right (testing and monitoring in production) both support growth.
