Cybersecurity and process safety: An integrative approach to risk management
In today’s world, interconnectivity, digitalization, automatic control systems and other technological advances permeate both work and play. Indeed, the same tools people use on a daily basis to optimize their private lives have also been adapted to optimize industrial processes of every stripe. Today, almost all process plants have industrial control systems (ICS) embedded in the various levels of the company’s digitalization, from field devices (instruments, actuators, relays, etc.) to the highest level of corporate servers.
These systems remotely monitor and control worksites, acquiring and transmitting data without requiring personnel to travel long distances. The devices that make up an ICS can open and close valves and breakers, collect data from sensor systems and monitor the local environment. Within a single plant, an ICS can centrally control the various phases of production, gather and share data for quick access and find and remedy faults while reducing their overall impact. Efficiency is not the only advantage to an automated system. Worker health and safety also benefit from these systems’ ability to detect danger quickly and reliably.
However, no system is invulnerable. In an industrial context, a technology malfunction can lead to financial losses, asset damage, environmental consequences and even injury to humans or loss of life. The scale of the consequences can be massive and can also be the result of criminal activity that targets vulnerabilities in these automated, centralized cyber systems.
Facing the downside of digitalization
The scope of the damage that can be done when organizations fail to establish robust, resistant cyber protections is far greater than what may befall an individual technology user. When a plant fails or struggles financially, when the air or water is polluted, or when employees’ health and safety are compromised, the effects are far-reaching. Because the stakes are so high, industry leaders must understand that cyber threats are just as potent as the safety risks they have traditionally confronted and can indeed hijack the conventional safety measures they have put in place. In the cyber age, it is possible to disable alarms, manipulate controls or tamper with the signals workers rely on to ensure safety, all without direct physical access.
Human error, the culprit behind many industrial accidents, continues to play a role in cyber-related disasters. Employees or contractors may inadvertently plug an infected machine into the system, connect to an unsecured network, download the wrong program or install malware. What is new is the increased potential for remote attacks. A disgruntled employee who knows the system may be motivated by revenge. Hackers may break into the network for financial gain or political advantage. Those seeking a competitive edge may steal secrets or cripple production. Other cybercriminals may be intent on disrupting critical infrastructure, from nuclear plants to water supplies to electrical grids. Whether small or large, simple or sophisticated, the risks created by advancing technology demand the attention of industry leaders.
Against this backdrop, safety authorities pose two main questions to their industrial clients and partners. First, if a cyberattack is underway, what security measures are preventing it? Second, when (not if) a cyberattack succeeds, what is the ultimate risk to people?
Both of these questions are crucial, but it is important to highlight the essential difference between them: one is concerned with attack prevention and the other identifies the ultimate unwanted risks to people.
Hackers make headlines
In 2018, hackers made the biggest headlines with attacks on financial and political institutions, but infrastructure also fell victim. In addition to the high-profile assault on Britain’s National Health Service in April, a cyberattack accessed U.S. power grids over the summer. No damage was reported, but the perpetrators were able to gain vital information that could be used to inflict greater harm in the future.
So far, the results of most published cases of cyberattacks aimed at industry have been limited to economic damage. In 2017, the Petya virus was behind a 3% drop in one large company’s quarterly sales figures and resulted in a loss of £110 million (approximately $140 million) for another company. However, it is easy to imagine far worse outcomes. Corporate spies could exploit network weaknesses to steal secrets, sabotage production and inflict lasting damage on competitors. Terrorists could target plants that use hazardous substances as part of an attack on the civilian population, causing explosions, contaminating the air or water supplies and taking human life. These are not risks worth running. They require a systematic analysis and a proportionate response.
Cyber protection with process safety tools
As frightening as these scenarios may be, it is important to realize that industry can leverage many of the tools it already employs as part of process safety management in the fight against cyber threats. Both process safety and cybersecurity aim to prevent or mitigate events involving a loss of control of hazardous materials and energy sources. Recognizing and exploiting this overlap is key when building robust cyber defenses.
The risk-based approach at the heart of the process safety life cycle extends successfully to cybersecurity in an industrial process context. Risk measurement frameworks traditionally used in process safety work equally well for cybersecurity. At the same time, each discipline has a distinct life cycle requiring continuous management, and each affects multiple and overlapping aspects of industrial processes.
A formula for calculating risks
The general principle used in process safety for assessing risk is applicable universally, wherever hazardous situations arise. Essentially, the level of risk is the product of two factors: the severity of the consequences produced by the hazard and the probability of those consequences coming to pass (Figure 3).
In a cyber context, perhaps the hazard is that sensors used to indicate dangerous levels of certain substances become disabled as a result of hacking, technical malfunctions or user error. The consequences might include damage to machinery or other equipment or even injury to personnel. A worst-case scenario could involve an explosion that injures or kills people and releases toxins into the environment. The consequences would of course vary depending on the specifics of the plant in question, as would the third element, probability. This refers to the likelihood of an incident occurring. In process safety, this is a real number from 0 to 1. If an event is nearly certain, the probability assigned is near 1; if it is almost impossible, nearly 0.
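The principle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1-to-5 severity scale, the scenario names and every number are hypothetical and are not taken from any plant or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    consequence: float  # severity on a hypothetical 1-5 scale (assumption)
    probability: float  # likelihood, a real number from 0 to 1

    def risk(self) -> float:
        """Risk as the product of consequence severity and probability."""
        if not 0.0 <= self.probability <= 1.0:
            raise ValueError("probability must lie between 0 and 1")
        return self.consequence * self.probability


# Hypothetical outcomes of the same hazard: a disabled gas sensor.
scenarios = [
    Scenario("Gas sensor disabled, toxic release", consequence=5, probability=0.02),
    Scenario("Gas sensor disabled, equipment damage", consequence=2, probability=0.25),
]

# Rank the scenarios from highest to lowest risk.
ranked = sorted(scenarios, key=Scenario.risk, reverse=True)
for s in ranked:
    print(f"{s.name}: risk = {s.risk():.2f}")
```

In practice, both the severity scale and the probability estimates would come from the plant’s own risk matrix and from the interdisciplinary team’s judgment.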
The example above demonstrates the complexity of industrial hazards and underscores the importance of cooperation between environment, health and safety (EHS), IT and operations teams when confronting cyber threats. There are no longer well-defined lines of demarcation among these divisions; the success of one in combatting hazards is dependent on the others.
Interconnectivity means interdependence
The process safety life cycle is typically conceptualized as four continuously repeating phases (Figure 4).
The simplicity of Figure 4 belies the complexity of the task, however. For instance, identifying hazards has to go beyond the superficial to be effective, and this requires experience and expertise. Current process safety management uses tools such as hazard identification (HAZID), hazard and operability study (HAZOP), control hazard and operability study (CHAZOP) and failure mode and effects analysis (FMEA) to facilitate this step, and these tools demand the input of professionals with an intimate knowledge of the processes in question. When processes are automated or digitalized, not only must health and safety officials and operations supervisors have a place at the table, but cyber experts must as well.
The same goes for the second phase, risk assessment. Here, too, process safety specialists have developed instruments such as safety integrity level (SIL) and layer of protection analysis (LOPA) to evaluate risk. Adapted for use in a cyber context, these tools ensure proper independence of safety measures, as required by safety standards. To assess the resistance of a cyber network to attack, it is vital to investigate its weaknesses and points of access. Process safety tools can aid in these endeavors.
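As a rough illustration of how a LOPA-style calculation might be adapted to a cyber context, the sketch below multiplies an initiating event frequency by the probability of failure on demand (PFD) of each protection layer. In the attack case, it declines to credit any layer that a cyberattack could defeat. That crediting rule, the layer names and all figures are assumptions made for illustration, not values from a standard or from the authors.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class ProtectionLayer:
    name: str
    pfd: float              # probability of failure on demand, 0 to 1
    cyber_vulnerable: bool  # could a cyberattack defeat this layer?


def mitigated_frequency(initiating_per_year, layers, cyber_attack=False):
    """Initiating frequency multiplied by the PFD of each credited layer.

    In a cyberattack scenario, layers the attack could defeat are not
    credited, which is equivalent to treating their PFD as 1.
    """
    credited = [
        layer.pfd
        for layer in layers
        if not (cyber_attack and layer.cyber_vulnerable)
    ]
    return initiating_per_year * prod(credited)


# Hypothetical layers and illustrative PFD values (assumptions).
layers = [
    ProtectionLayer("High-pressure alarm with operator response", 0.1, True),
    ProtectionLayer("Pressure relief valve", 0.01, False),
]

accidental = mitigated_frequency(0.1, layers)
under_attack = mitigated_frequency(0.1, layers, cyber_attack=True)
print(f"accidental: {accidental:.1e} per year")
print(f"under attack: {under_attack:.1e} per year")
```

With these illustrative numbers, crediting both layers gives a mitigated frequency of roughly one event in 10,000 years, while the attack case, which credits only the relief valve, gives roughly one in 1,000 years.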
Managing risks means reducing their impact and frequency. Again, cooperation across disciplines is essential for effective risk management as industrial processes become increasingly intertwined with cyber networks. Solutions designed by interdisciplinary teams drawn from EHS, operations and IT will undoubtedly prove more robust in the face of new technological hazards than single-discipline approaches.
The final phase, revision or review, can include audits, training programs, accident investigation and other forms of consolidation. It propels the life cycle onward as new information comes to light regarding either internal blind spots or external developments and advances. With the rapid changes taking place in technology, this is an especially important step for a robust, resistant cybersecurity system.
HAZOP with a cyber twist
One of the most popular process hazard analysis (PHA) tools used to identify dangers (phase 1 of the process safety life cycle) is the HAZOP study. With some care, cybersecurity issues can be integrated into HAZOP, which enables experts to undertake a cybersecurity assessment of the process. This assessment evaluates not just the causes of particular hazards, but also the safeguards against them. It must pay particular attention to the independence of safeguards in terms of their vulnerability to cyberattacks, as well as identifying the ultimate risk to people.
A cybersecurity assessment starts by looking at the cause of a given scenario, or the factors contributing to a deviation from normal processes. For instance, if a hazard arises from a technological failure affecting a reactor’s automated temperature control loop, then the cause of this hazard is considered vulnerable to cyberattack. Conversely, if human error leads to an incorrect catalyst charge to the reactor, the cause is not vulnerable to cyber manipulation.
A cybersecurity assessment also considers the different safeguards in place to ensure normal functioning, evaluating each of them separately. A safeguard is any mechanism intended to prevent accidents or to limit damage should an incident occur. An automated high-pressure alarm is a type of safeguard that is vulnerable to attack by cybercriminals, while a pressure relief valve or rupture disc is not. In a cyberattack situation, operators may find themselves relying on display data that has been manipulated to hide the actual attack. Alarms require operator action, and not only could the alarm itself be false, but the displayed status of the process plant could be inaccurate as well. Alarm systems are therefore highly vulnerable to cyberattack.
If both causes and safeguards are vulnerable to cyberattack, and there are no safety measures available that are resistant to such attacks, then the cybersecurity assessment turns to the consequences: potential damage to people and the environment. Assessments can include the risk of a cyberattack on production, assets and reputation.
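The screening logic described above can be sketched as a simple check. This is a minimal illustration, not a HAZOP tool: the class names, fields and example scenarios are hypothetical, and the vulnerability flags would in practice be assigned by the interdisciplinary team.

```python
from dataclasses import dataclass, field


@dataclass
class Safeguard:
    name: str
    cyber_vulnerable: bool  # judged by the assessment team


@dataclass
class HazopScenario:
    deviation: str
    cause: str
    cause_cyber_vulnerable: bool
    safeguards: list = field(default_factory=list)

    def needs_consequence_review(self) -> bool:
        """True when the cause can be provoked by a cyberattack and no
        safeguard is resistant to such an attack."""
        if not self.cause_cyber_vulnerable:
            return False
        # all() is True for an empty list: no safeguards means no protection.
        return all(s.cyber_vulnerable for s in self.safeguards)


alarm = Safeguard("Automated high-pressure alarm", cyber_vulnerable=True)
relief_valve = Safeguard("Pressure relief valve", cyber_vulnerable=False)

scenarios = [
    HazopScenario("High temperature", "Temperature control loop failure",
                  True, [alarm]),
    HazopScenario("High temperature", "Temperature control loop failure",
                  True, [alarm, relief_valve]),
    HazopScenario("High reaction rate", "Incorrect catalyst charge (human error)",
                  False, [alarm]),
]

for s in scenarios:
    flag = "review consequences" if s.needs_consequence_review() else "no cyber escalation"
    print(f"{s.cause} / {[g.name for g in s.safeguards]}: {flag}")
```

Only the first scenario is flagged: its cause is open to cyber manipulation and its sole safeguard is an alarm. Adding a relief valve, or starting from a cause that cannot be triggered remotely, removes the flag.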
At this point, the cybersecurity assessment has reached its objective: identification of potential hazards and operational problems, in this case those that can be provoked by a cyberattack. The report lists all the available safeguards and their vulnerability to attack. The generation and design of appropriate solutions takes place in subsequent phases of the process safety life cycle.
Moving forward: Integrated process safety management
Industrial control systems, like social media and online banking, are a fact of life in a digital age. The challenge is how to reap the benefits while minimizing the risks. Fortunately, industry can expand proven process safety methodologies to strengthen resistance to cyberattacks. Indeed, cyber risks can be easily integrated into process hazard analyses in a way that prevents the unnecessary duplication of effort or expense. It is a matter of intelligently adapting existing process safety tools and recognizing the interdependency of IT, EHS and operational concerns.
An experienced interdisciplinary team can effectively manage conventional process safety while simultaneously identifying and analyzing scenarios whose causes and safeguards are vulnerable to cyberattack. Enlisting the help of third-party process safety experts can ease integration of a cyber dimension into organizations’ safety management systems. Among the many uncertainties digitalization brings, one thing is certain: industry cannot afford to neglect cybersecurity issues.
Dr. Arturo Trujillo is global director of process safety consulting at DEKRA. His main areas of expertise are diverse types of process hazard analysis (HAZOP, What-if, HAZID), consequence analysis and quantitative risk analysis. He has facilitated more than 200 HAZOPs over the past 25 years, especially in the oil and gas, energy, chemicals and pharma industries. Trujillo may be reached at [email protected].
Clive de Salis is principal process safety specialist and consultant in process design safety, critical instrumentation and hazards. He writes both the IEC 62443 series of standards on cybersecurity and the IEC 61508 series, which includes IEC 61511 on SIL-rated systems. His main areas of expertise are process risk assessment, including HAZOP, with extensive experience in the design and installation of safety systems and determination of safety integrity levels. His recent experience includes serving as an expert witness, selected by barristers and solicitors, in dust explosion cases. He may be reached at [email protected].