Global IT disruption caused by a defective CrowdStrike Falcon Sensor update

EdwinG

A global service disruption is underway, caused by a defective #CrowdStrike #FalconSensor update that is making #Windows computers bugcheck (colloquially known as a BSoD).



CrowdStrike is rolling back the update.

Needless to say, infrastructure Information and Communications Technology teams will have a very rough day today. So, my hat is off to you; I have been there, and in moments like this it’s really not fun - even if you love your job. Take care of yourself no matter what; it’s not worth a burnout.

To the people not working on this, offer to buy your Service Desk Technologists and Systems Administrators a non-alcoholic drink of their choice.
 
There is also a registry change which can block CrowdStrike from starting:

HKLM:\System\CurrentControlSet\Services\CSAgent\Start
Change the value from 1 to 4 to prevent the CrowdStrike agent from starting.
I don’t think the registry key fixes it.

From what I gather, it’s a kernel-level module (driver) that causes the bugcheck.

If disabling the service works to start Windows, great! :) But don’t forget to roll back the update and re-enable the service.
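
For anyone who would rather script that registry change than edit it by hand, here is a minimal sketch using Python's standard winreg module - purely an illustration, assuming an elevated prompt on a machine that still boots. It reads the old Start value first so the change can be undone once the fixed update is in place.

# Sketch only: disable the CSAgent service by setting Start to 4 (disabled),
# as described in the post above. Run from an elevated (administrator) prompt.
import winreg

KEY_PATH = r"SYSTEM\CurrentControlSet\Services\CSAgent"

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                    winreg.KEY_READ | winreg.KEY_SET_VALUE) as key:
    old_value, _ = winreg.QueryValueEx(key, "Start")
    print(f"current Start value: {old_value}")  # typically 1, per the quoted post
    winreg.SetValueEx(key, "Start", 0, winreg.REG_DWORD, 4)
    print("CSAgent Start set to 4; restore the old value after rolling back the update.")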
 
My condolences to those IT admins that have to deal with this.

What is really irritating me about the news coverage is that they are not saying anything like "if you don't use CrowdStrike services, you're okay". With terms like "global outage", they are making it sound as though every Windows PC on the planet is affected. It's striking needless fear into some folks. I even went to bleepingcomputer.com and wired.com, and their coverage does not make this important distinction (yet).
 
No, the registry hack won't circumvent the problem.

Instructions posted by CrowdStrike say to boot into safe mode, navigate to the C:\Windows\System32\drivers\CrowdStrike directory, enter “C-00000291*.sys” into the Windows Explorer search box, delete all of the .sys files whose names start with C-00000291, and reboot. They reference this Microsoft link.
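
If you would rather script that cleanup than click through Explorer while in safe mode, a minimal Python sketch of the same steps might look like the following - assuming an interpreter and admin rights are even available in your recovery environment, which often they are not. The directory and the C-00000291 prefix are the ones from the instructions above.

# Sketch only: delete the C-00000291*.sys channel files named in the posted
# remediation steps, then reboot manually. Run elevated, from safe mode.
import glob
import os

CHANNEL_DIR = r"C:\Windows\System32\drivers\CrowdStrike"

for path in glob.glob(os.path.join(CHANNEL_DIR, "C-00000291*.sys")):
    print(f"removing {path}")
    os.remove(path)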

We were not affected at $JOB. Here at home, my wife's two Windows PCs don't use CrowdStrike Falcon, and I'm a FreeBSD user.
 
Our Windows team says they were not affected. As it was explained to me, CrowdStrike pushed out a faulty update to their Falcon Sensor. Falcon Sensor is a kernel module, and a fault in that kernel module caused affected Windows machines to panic (BSOD). Speaking from experience, a kernel bug that has been pushed out can make for a very bad day, both for those receiving the bad code and for those who pushed it out. I feel sorry for everyone involved.
 
My condolences to those IT admins that have to deal with this.

What is really irritating me about the news coverage is that they are not saying anything like "if you don't use CrowdStrike services, you're okay". With terms like "global outage", they are making it sound as though every Windows PC on the planet is affected. It's striking needless fear into some folks. I even went to bleepingcomputer.com and wired.com, and their coverage does not make this important distinction (yet).
When I heard the CBC newscast on the radio, around 6:30 EDT, they were already confirming that it was caused by CrowdStrike’s update.

There were two disruptions overnight, one for Azure in the central USA region, and the CrowdStrike one.
 
When I heard the CBC newscast on the radio, around 6:30 EDT, they were already confirming that it was caused by CrowdStrike’s update.

There were two disruptions overnight, one for Azure in the central USA region, and the CrowdStrike one.
Right. But they were making it sound like Crowdstrike's update was somehow tied to Microsoft Updates. It gave the impression that all Windows PCs were affected, not just ones that were paying for Crowdstrike security services. I've had several people who are not in IT ask me if it was okay to use their Windows computer, or some other related question showing confusion over which PCs were affected and which weren't. All the media needs to say is "This issue affects ONLY Windows PCs that are part of Crowdstrike's paid 3rd party security services."
 
I took time out to do a quick store run at noon. CBC is still reporting a general Internet outage. :\
 
Right. But they were making it sound like Crowdstrike's update was somehow tied to Microsoft Updates. It gave the impression that all Windows PCs were affected, not just ones that were paying for Crowdstrike security services. I've had several people who are not in IT ask me if it was okay to use their Windows computer, or some other related question showing confusion over which PCs were affected and which weren't. All the media needs to say is "This issue affects ONLY Windows PCs that are part of Crowdstrike's paid 3rd party security services."
I have somehow avoided all of that. I had one question: “is this a virus?” 😅
I guess I’m good at picking it up early enough to dispel questions.
 
Nothing of this event (cough...outage) is what we're being told, Crowdstrike and Microsoft are receiving a purposeful public whipping (payment) for their past treasonous deeds...disappeared they and many other giant tech will be. ;););)

"Does Ukraine have the DNC server like Trump says? We fact checked that."
 
I have somehow avoided all of that. I had one question: “is this a virus?” 😅
I guess I’m good at picking it up early enough to dispel questions.
You "somehow avoided all that" if you're not a customer of Crowdstrike. ;)

Seriously though, we've been told 15 new domains with the character string "crowdstrike" in them have appeared over the last 12-24 hours. Apparently some miscreants have started some phishing campaigns.
 
You "somehow avoided all that" if you're not a customer of Crowdstrike. ;)
I'm not talking about "me" in a professional capacity. I'm talking about family and friends asking about it :)

I told them: hey this is going on, be aware and be kind with the people that will help you at the office if you're having issues.
 
it's been strongly alleged by users/recipients that the c-00000291*.sys files which triggered the kernel panics are:

- not "drivers," but parameter files picked up by the actual Falcon Sensor kernel driver

- these .sys files, in the past, were not internally encrypted-in-use by FS

- that these .sys files, in the past, had a regular internal format

- that the BSOD .sys files are filled with gibberish (indistinguishable from possibly encrypted); so, highly non-conforming with previous releases

- some have called normal c-00000###*.sys files "windows channel files"

can anyone here confirm/refute any/all of the above?

questions that come to my mind:

- why does crowdstrike blindly load these .sys files? why is there no pre-load validation of them?

- when FS encounters non-conforming .sys files, why does FS not FAIL GRACEFULLY, without BSOD? (client logfile entries naming non-conforming .sys files would be another good thing to expect.) (client-side automatic fallback to last known good .sys files would be yet another good thing - sketched below.)

- why is there no automated final validation-&-hold-on-fail (for conforming internal .sys file structure) at crowdstrike, prior to pushing them out to the world?

has klownstrike ever subjected Falcon to 3rd-party code/architectural review?
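
(to make the "validate before load, fall back to last known good" idea from the questions above concrete, here is a purely illustrative python sketch - nobody's real code, and the magic header, sizes & file names are invented for the example.)

# illustrative only: structurally sanity-check a channel/parameter file before
# anything kernel-adjacent consumes it, and fall back to the last known good
# copy when the check fails. MAGIC and MIN_SIZE are invented placeholders.
import shutil
from pathlib import Path

MAGIC = b"CSPF"    # hypothetical 4-byte header a well-formed file would start with
MIN_SIZE = 64      # hypothetical minimum sane size, in bytes

def looks_valid(path: Path) -> bool:
    """Cheap structural check; never raises on garbage or missing input."""
    try:
        data = path.read_bytes()
    except OSError:
        return False
    return len(data) >= MIN_SIZE and data.startswith(MAGIC)

def file_to_load(new_file: Path, last_good: Path) -> Path:
    """Return the file that should actually be loaded, logging the decision."""
    if looks_valid(new_file):
        shutil.copy2(new_file, last_good)  # promote the new file to last-known-good
        return new_file
    print(f"{new_file} failed validation; falling back to {last_good}")
    return last_good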
 
CrowdStrike has committed major sins here. One bad file should not cause so much havoc. That their environment was so fragile is inexcusable.

Yes, there should be validation of the .sys files. Think Bobby Tables. CrowdStrike does not sanitize its inputs. They are clearly a sloppy company with shitty programmers.

They failed to detect the bad file before shipping it.
They failed to stagger the release of the bad file.
Their software parsed the bad file badly.
Three strikes; they should be out.

But CrowdStrike's clients share in the blame too. Blindly trusting updates is a brutally obvious mistake.

I heard two newscasts blame the crashes on Windows. So sad, the blind informing the blind. As bad as MS and Windows are, this was not their fault in any way, from what we know now.
 
As bad as MS and Windows are, this was not their fault in any way, from what we know now.
I will partially put the blame on Microsoft (and its well-known peers) for allowing kernel modules (aka drivers) to exist.

The solution is not easy, but it might be time to move away from allowing anything external to be loaded into the operating system kernel; this would reduce the risk of such an event. And yes, I'm very aware that it would break things and get rid of what we currently know as drivers.

Will it happen? Probably not, but one can hope!

Blindly trusting updates is a brutally obvious mistake
From my understanding (see https://www.crowdstrike.com/blog/technical-details-on-todays-outage/), this was what we used to call "definitions updates" for antivirus software.

This is the kind of base update that one would expect the AV software to receive multiple times per day.
 
Nothing of this event (cough...outage) is what we're being told, Crowdstrike and Microsoft are receiving a purposeful public whipping (payment) for their past treasonous deeds...disappeared they and many other giant tech will be. ;););)

"Does Ukraine have the DNC server like Trump says? We fact checked that."
Huh! "Did you know that the President of CrowdStrike Services and the Chief Security Officer served as the Executive Assistant Director of the FBI under the Obama-Biden administration?"
 
I saw a picture of a "Tweet" (the UI in the image doesn't have the normal Twitter elements I'm used to seeing, unless Twitter looks that different on iPhone) claiming that the CEO of CrowdStrike was the CTO of McAfee when McAfee did the same thing back in the Windows XP days. Actually, if this is true, it might explain how he figured out what the error was so quickly. He was there when it happened before and was like, "Not again!"
 
former FBI "credentials" do not guarantee a technical, or even business-operational, level of competence. the officers of klownstrike are not at all suspected by me of any malicious intent; but arrogance (dissembling CYA humility in PR only), hubris ("we've been doing it this way for years," an on-air klown quote) & abject incompetence (biz-op)? those i do suspect.

constantly updated parameter files, picked up by kernel-mode ANYTHING, have to always be sanity-checked against kernel-panic failure: checked before issuance at their "source," & double-checked on the "client" side, prior to actual kernel load. frequency of update is no excuse for not doing these things. auto-rollback is also required for a klownstrike fail-safe. on all counts, these klowns failed.

whether the klowns make any significant & effective changes or not...

i am not debating kernel-mode drivers as such, but M$ has never exercised adequate & proper control over access to & participation in the kernel. malicious drivers? rootkits? heck yeah.

enterprise klown clients aren't completely off the hook in this, either. legally validated/certified biz operations require validated/certified/monitored/actively remediated IT systems.

no enterprise i know of or work with internally pre-validates all patches/updates prior to placement into production. nobody is funded/staffed for that in our just-in-time world.

but enterprises that do follow my advice execute staggered rollouts of batched updates/patches into production. prioritizing well version-controlled VMs as the earliest adopters, they are observed for success or failure to fully boot into production. if a segment fails to boot, then freeze/block the updates/patches & roll back to last known good. on first success, internally serve the boot-validated updates/patches to the next enterprise segment (VM silo, biz/op silo, geo region - any reasonable segmentation will do), observe, & so on. never everything, everywhere, all at once.
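
(for illustration only, a rough python sketch of that staggered-rollout loop - the segment names and the deploy/boot-check/rollback hooks are invented placeholders, not any vendor's tooling.)

# illustrative only: roll a batched update out one segment at a time, gate each
# segment on a successful boot check, and roll everything back on first failure.
from typing import Callable, Sequence

def staggered_rollout(
    segments: Sequence[str],            # e.g. ["vm-canaries", "region-a", "region-b"]
    deploy: Callable[[str], None],      # push the batched updates/patches to one segment
    boots_ok: Callable[[str], bool],    # did that segment come back up cleanly?
    rollback: Callable[[str], None],    # restore last known good for one segment
) -> bool:
    deployed: list[str] = []
    for segment in segments:
        deploy(segment)
        if not boots_ok(segment):
            # freeze: roll back the failed segment and everything deployed before it
            for touched in [segment, *reversed(deployed)]:
                rollback(touched)
            return False
        deployed.append(segment)
    return True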

every update, everywhere, all at once, 24x365 is a business policy/operational/governance failure - much larger & far worse than any individual klown strike. EEAAO24x365 appears to be standard practice for client enterprises globally. who's responsible for that? who's questioning this?

the world was never completely at the mercy of only a klown strike. we allowed completely unmanaged & unconscionably brittle updates to deliver a globally showstopping sucker punch. "we" dared any klown to deliver a sucker punch.

this is not off topic, even though it fully exceeds "IT."
 
I saw a picture of a "Tweet" (the UI in the image doesn't have the normal Twitter elements I'm used to seeing, unless Twitter looks that different on iPhone) claiming that the CEO of CrowdStrike was the CTO of McAfee when McAfee did the same thing back in the Windows XP days. Actually, if this is true, it might explain how he figured out what the error was so quickly. He was there when it happened before and was like, "Not again!"
He was McAfee’s CTO between 2009 and 2011, so yes, during that faulty definitions-update disruption as well.

The Reuters article, I found it through Wikipedia. The BizJournals one, I had to search for it.

Appointment:

Resignation:
 