
IT chaos could take days to fix, experts warn

July 19, 2024
There were long lines at Barcelona airport, as passengers waited to be checked in manually

LONDON — The boss of cyber-security firm Crowdstrike has admitted it could be "some time" before all systems are back up and running after an update from the company triggered a global IT outage.

Experts are warning that it could take days for big organizations to get back to normal.

Although there is now a software fix for the issue, the manual process required will take a huge amount of work, they said.

The global outage has led to almost 1,400 flights being canceled, while banking, healthcare and shops have all been affected.

The issue arose when an update from Crowdstrike caused Microsoft systems to "blue screen" and crash.

The faulty piece of software was sent out automatically to the firm's customers overnight, which is why so many were affected when they came into work on Friday morning.

It meant their computers could not be restarted.

Writing on X, Crowdstrike chief executive George Kurtz said: "The issue has been identified, isolated and a fix has been deployed."

In an interview on NBC's Today Show in the US, Kurtz said the company was "deeply sorry for the impact that we've caused to customers".

"Many of the customers are rebooting the system and it's coming up and it'll be operational," he said, but added: "It could be some time for some systems that won't automatically recover."

The fix will not be automatic, but what the industry calls a "fingers on keyboards" solution.

Researcher Kevin Beaumont said: “As systems no longer start, impacted systems will need to be started in ‘Safe Mode’ to remove the faulty update.

"This is incredibly time-consuming and will take organisations days to do at scale."

Technical staff will need to go and reboot each and every computer affected, which could be a monumental task.

Crowdstrike is one of the biggest and most trusted brands in cyber-security.

It has about 24,000 customers around the world and protects potentially hundreds of thousands of computers.

The wording of Kurtz's statement suggests the overnight update was supposed to be small, describing it as a "content update".

So it was not a major refresh of the cyber-security software. It could have been something as innocuous as the changing of a font or logo on the software design.

That could potentially explain why the software was not checked as rigorously as a major update would have been. But it also poses the question: how could a small update do so much damage?

One struggling IT manager said the process to get computers back up and running is quick once an IT person is at the machine, but the problem is getting them to the machines.

The person, who wished to remain anonymous, is responsible for 4,000 computers in an education company and said his team were working flat out.

“We have managed to fix all of our servers using the command prompt as a workaround, but for many of our PCs, it's not easy to do manually as we are spread out across five sites. Any PCs that are left switched on overnight are affected and we're rebuilding them,” he said.

IT experts say this manual process will be particularly hard in large organisations with thousands of computers that are potentially under-resourced in IT.

Small and medium-sized businesses without dedicated IT teams or which outsource their IT issues might also struggle.

The larger, more resourced companies, like American Airlines, appear to be fixing the problems rapidly.

Interestingly, it looks like many in the US might be less affected, as computers that had not yet been switched on can be started up to download the corrected software instead of the bad version. But that might still involve a level of manual operation.

Beaumont said that one of the world’s "highest impact IT incidents" was "caused by a cyber-security vendor".

Ironically, if a customer was affected by this, it was because they followed the usual advice issued by cyber-security experts – install the security updates when you receive them.

While some security companies in the past have accidentally sent out a dodgy software update, we’ve never seen one at this scale and this damaging.

While this incident has caused widespread disruption, the WannaCry cyber-attack in May 2017 was potentially worse.

That was a malicious cyber-attack that affected an old version of Microsoft Windows and spread automatically to any computer that had the old and unprotected Windows software.

It affected an estimated 300,000 computers in 150 different countries.

It hit the NHS for days, affecting doctors' surgeries and hospitals around the country.

In that case it was an attack thought to be carried out by North Korea that got out of hand.

The NotPetya attack a month after that was eerily similar in method and damage.

In contrast, the outages on Friday are a mistake and not an attack. — BBC

