Attributed to Alexander Liskin, Head of Threat Research, Kaspersky
The whole world has been following the news of a global IT outage affecting thousands of businesses, including airports and banks. It is already known that it was caused by a faulty software update released by cybersecurity vendor CrowdStrike.
Based on media reports, the number of affected companies, and of the devices they use, may run into the hundreds of thousands. At this stage, it is difficult to estimate how long the fix will take: when such a problem occurs, each device (computer, laptop or server) must be manually rebooted into safe mode, which cannot be done using management tools. This is a very serious problem that has affected numerous processes, including those in critical infrastructure.
To avoid such situations, information security vendors need to be highly responsible about the quality of the updates they release. At Kaspersky, all updates are accompanied by a significant number of internal tests and checks. Until they are passed, the release will not be rolled out to customers. Since 2009 we've been running an internal framework to prevent mass failures among customers. Within this framework, each update undergoes a multi-level quality check. This allows us to fix every problem that is identified before a release, analyze the reasons behind each issue and develop preventive measures accordingly.
It is also important to adhere to the principle of a granular release of updates. This means that they are not distributed globally to all customers simultaneously, but gradually, so that in case of any unforeseen failure, it is possible to localize and fix it quickly.
In addition, it is necessary to monitor each rollout and respond immediately to any problem by halting the distribution of the update.
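The principle of gradual release with an emergency stop can be illustrated with a minimal sketch. This is not a description of Kaspersky's internal framework; the function `staged_rollout`, the callbacks `deploy` and `is_healthy`, the stage fractions and the 1% failure threshold are all hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class RolloutResult:
    completed_stages: int   # stages that passed the health check
    halted: bool            # True if the rollout was stopped early
    devices_updated: int    # devices that received the update


def staged_rollout(
    devices: Sequence[str],
    stage_fractions: Sequence[float],      # cumulative share, e.g. (0.01, 0.1, 1.0)
    deploy: Callable[[str], None],         # hypothetical: pushes the update to one device
    is_healthy: Callable[[str], bool],     # hypothetical: post-update health check
    max_failure_rate: float = 0.01,        # assumed threshold for stopping
) -> RolloutResult:
    """Release an update in widening stages; stop as soon as a stage looks bad."""
    updated = 0
    for stage_index, fraction in enumerate(stage_fractions):
        target = min(len(devices), int(len(devices) * fraction))
        batch = devices[updated:target]
        if not batch:
            continue
        for device in batch:
            deploy(device)
        updated = target
        failures = sum(1 for device in batch if not is_healthy(device))
        if failures / len(batch) > max_failure_rate:
            # Emergency stop: the remaining devices never receive the update.
            return RolloutResult(stage_index, True, updated)
    return RolloutResult(len(stage_fractions), False, updated)
```

In this sketch, a faulty update that breaks every device in the first stage reaches only that first 1% of the fleet before distribution stops, which is what makes the failure possible to localize.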
If any unexpected issues affecting our users arise, we always register them with the appropriate priority and analyze what measures need to be taken and implemented. Solving the problem becomes a priority at all levels in the company. As with all cyber incidents, it is crucial not only to eliminate the visible damage, but also to find and fix the root cause in order to prevent similar incidents in the future.