Power system security is much too big a topic for this article, but I can squeeze in a few points about vulnerability to cyber attacks. First, this topic follows naturally from Part 2’s discussion of grid control (remember that flyball governor?). Second, I feel it is apropos in an era when the public believes that hacking the power grid is both easy and devastating. Books like Lights Out: A Cyberattack, A Nation Unprepared, Surviving the Aftermath, by Ted Koppel, reach the New York Times bestseller list. In my opinion the book is hogwash. Koppel paints a grim picture in which almost any hacker can attack the grid and thus plunge the country into a Mad Max-like post-apocalyptic future. Koppel’s book is reminiscent of Gary North’s fear campaign on Y2K that needlessly frightened millions of people, while Mr. North profited.
It is a truism among cyber security experts that the first to fall are those who say, “I am covered. Nothing could possibly go wrong.” I’m not going to say that here, but I do want to point out some obstacles (independent of cyber security measures) that grid hackers would face even if they did defeat the cyber security. That is why I said Cyber Resilience rather than Cyber Security in the title of this article.
First, think of how and where to plan a cyber attack on the grid. Consider the pyramid of devices associated with grid operations for a state or a region. At the top there is only one master oversight system, which can be expected to have better protection than the lower levels but is also a more tempting target. The lowest level involves a million or more independent devices, but the effects of knocking them out one at a time range from none to purely local.
Level 1 – The Physical Layer
The power grid is constantly exposed to ice storms, hurricanes, earthquakes and other natural disasters. For example, an ice storm in 1998 affected NY, New England, and Quebec and created more than 300,000 simultaneous failures on the grid without causing a cascading blackout that spread to other areas. It took several months to restore power to all customers, but most were restored in hours. Ditto for the 1989 Loma Prieta Earthquake in San Francisco. Ditto for Hurricane Katrina.
Multiple simultaneous failures are the daily bread and butter for the power grid. Cascading blackouts are not.
But what about strategically targeted attacks? Some transformers are nearly as big as houses, difficult to protect from sniper rifles, and difficult to replace. Attacking the most strategic targets might force the grid to split into isolated islands for a while. Power islands may not have all the capacity they want, and we may be unable to ship as many huge blocks of power over large distances, leading to curtailment of some customers some of the time. The name for the voluntary form of curtailment is “demand response” (recently a hot topic in the U.S. Supreme Court).
How do I know this stuff about level 1? Because I have worked in power system planning. System planners simulate hypothetical events (both man-made and natural) continuously, testing the resiliency of the grid. In fact, that is the planner’s primary function.
Level 2 – Local Digital Devices
Mostly in substations, we have sensors, actuators, PLC controllers, protective relays, surveillance, and other devices, some digital and some analog. Some have no remote communication ability; some do. Hackers targeting these devices expect cyber security at this level to be weaker than at higher levels. Nevertheless, there is one huge obstacle they must overcome: diversity.
In my experience, the utility industry has bought and installed instances of almost every product from every manufacturer that ever offered products for sale. Some are updated to newer versions, some aren’t. Digital devices of every age, from 1 day to 1 year to 40 or more years old, are in service.
Computer hacks are typically limited to a specific make, model and version. The infamous Stuxnet virus, for example, targeted specific Siemens PLCs, using MS Windows XP as a vector. Diversity makes this harder. To devise an attack effective against even 25% of the devices of a specific purpose, the hacker would need to cover perhaps 10,000 different combinations of make, model, and version. Malware, like any software, needs to be tested to be trusted to do what it is intended to do. Equipping a software testing laboratory to cover such diversity is beyond the resources of almost everyone.
The most famous known Level 2 attack was Stuxnet. It attacked centrifuges in an Iranian enrichment plant. But all the centrifuges and their PLC controllers were alike. Imagine if the Iranians had 10,000 centrifuges of 1,000 different makes and models.
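The diversity argument can be made concrete with a little arithmetic. The sketch below is only an illustration: the function name variants_needed and both device inventories are made up for this example, not taken from any real utility. It counts how many distinct make/model/version variants an attacker would have to support, starting with the most common, to reach a given share of the installed devices.

```python
def variants_needed(counts, target_fraction):
    """Return how many variants (most common first) must be covered
    to reach target_fraction of all installed devices."""
    total = sum(counts)
    covered = 0
    for n, count in enumerate(sorted(counts, reverse=True), start=1):
        covered += count
        if covered >= target_fraction * total:
            return n
    return len(counts)

# Hypothetical, made-up inventories of 1,000 devices each.
uniform_fleet = [1000]      # one variant, like the identical centrifuges
diverse_fleet = [10] * 100  # 100 variants with 10 devices apiece

print(variants_needed(uniform_fleet, 0.25))  # 1
print(variants_needed(diverse_fleet, 0.25))  # 25
```

With identical equipment, one tested exploit reaches everything. In the made-up diverse fleet, reaching even a quarter of the devices takes 25 separately developed and tested variants.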
Even if the grid itself is difficult to attack with cyber methods, bad guys could choose to attack the power plants upon which the grid depends. But in power plants, diversity applies in spades. Not only are the power plants of different generations and made by different companies, but they are also diverse in type (e.g. hydro, solar, nuclear, gas, wind) and in the design philosophy of the architect-engineers. Hackers planning to bring down large numbers of the roughly 6000 power plant units in the USA will have to deal with that diversity. Saying that no two power plants are alike may be an overstatement, but not by much.
Level 3 – Area Control Centers
Inside a region are individual power utilities. Each utility has a control center overseeing its own territory. The infamous SCADA systems that critics like to criticize mostly exist at this level.
Level 3’s primary functions are twofold. First, these centers relay the commands from Level 4 to the power plants at Level 2, and they provide a backup of Level 4 functions. Second, they oversee operations and, when necessary, initiate switching operations that open or close circuit breakers and switches.
Two kinds of cyber attacks are foreseeable. A DoS (denial of service) attack would seek to disable the Level 3 centers. The effects of that are similar to those of a natural disaster that wipes out the center, so in that sense they introduce nothing new. The utilities are required to have plans and procedures for dealing with that, and those plans must be tested and rehearsed periodically. Part of those plans is to station people at the substations to do switching operations manually if needed.
The second kind of attack is similar to the Stuxnet virus: it seeks to inject false feedback or induce false commands. It would also be similar to the havoc that the Y2K bug was expected to wreak on the grid. Obstacles to that include the fact that each area is likely to have hardware and software completely different from those of the neighboring areas, and even within one center, the control functions may be divided among several systems. Finally, the operators have the defense of shutting down the whole thing, thus falling back on the procedures that would apply if a flood or an earthquake wiped out the center.
Level 4 – Regional Control Center
In the USA, regional control centers are the highest level. We have no national or continent-wide control center. They serve many functions (see my Insights Article, What Happens When You Flip The Light Switch), but two stand out.
Economic – They manage the energy futures markets, and find the least cost way to provide the energy and auxiliary services needed for the region. In an emergency, economics can be disregarded.
Security – They continuously compute hundreds of thousands of hypothetical contingencies to assure that no single failure would cause a security violation. In major cities, combinations of contingencies two-at-a-time or three-at-a-time are considered. It was the failure to fulfill this mission in Ohio that caused the cascading blackout in 2003. If this function became unavailable for a long time, the remedy would be to operate the grid more conservatively, with power flows not so close to the limits. That could result in local shortages, and demand response or forced rolling blackouts, but it does not mean a semi-permanent blackout. Remember that the entire set of USA grids operated with zero regional level blackouts from 1882 to 1965. Computers allow us to operate more economically, and closer to the margins. But it is false to say we can’t operate fairly well without them.
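To get a feel for how quickly the number of cases grows when contingencies are combined two or three at a time, here is a minimal counting sketch. The network size of 1,000 elements and the function name contingency_cases are assumptions chosen for illustration. The code only counts cases; it does not simulate any of them.

```python
from math import comb

def contingency_cases(num_elements, max_simultaneous):
    """Count outage combinations of 1..max_simultaneous elements."""
    return {k: comb(num_elements, k) for k in range(1, max_simultaneous + 1)}

# Hypothetical network with 1,000 lines, transformers, and generators.
cases = contingency_cases(1000, 3)
for k, count in cases.items():
    print(f"{k}-at-a-time: {count:,} cases")
# 1-at-a-time: 1,000 cases
# 2-at-a-time: 499,500 cases
# 3-at-a-time: 166,167,000 cases
```

Each additional simultaneous failure multiplies the number of cases by a factor of several hundred in this made-up example.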
Needless to say, Level 4 centers get a lot of security attention, including redundant computers, red team testing, and completely retro analog backups. But just in case a DoS attack takes out Level 4, the Level 3 centers provide nearly complete backups for the critical functions. Almost none of the Level 4 hardware and software is the same as at Level 3.
The Point Most Overlooked
Here is what the doomsayers fail to recognize. There is no a priori reason to assume that the size distribution of blackouts caused by successful cyber-attacks will be substantially different from that of blackouts due to other causes. The curve below (USA data 1984-2005) shows that distribution. The horizontal axis S is the number of customers affected and the vertical axis is the fraction of blackouts of size S or larger.
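For readers unfamiliar with this kind of plot, the vertical axis is what statisticians call a complementary cumulative distribution. The sketch below shows how such a curve is computed from a list of blackout sizes. The sizes are invented purely for illustration and are NOT the 1984-2005 data, and the function name fraction_at_least is my own.

```python
def fraction_at_least(sizes, s):
    """Fraction of blackouts that affected s or more customers."""
    return sum(1 for size in sizes if size >= s) / len(sizes)

# Made-up blackout sizes (customers affected), for illustration only.
sizes = [1_000, 2_000, 5_000, 5_000, 20_000, 80_000, 300_000, 1_500_000]

for s in (1_000, 10_000, 100_000, 1_000_000):
    print(f"S >= {s:>9,}: {fraction_at_least(sizes, s):.3f}")
# S >=     1,000: 1.000
# S >=    10,000: 0.500
# S >=   100,000: 0.250
# S >= 1,000,000: 0.125
```

The fraction can only fall as S grows, so small events dominate the left side of the curve and the very large ones appear as a thin tail on the right.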
The conclusion I draw from that data is that experimentation and testing of cyber-attack tools will leave behind numerous histories of small-scale successes, each warranting much publicity. Recent news gave the first documented data point of a successful cyber-attack on any power grid. It happened in Ukraine, but it only affected half the homes in the Ivano-Frankivsk region for a limited period of time.
IMO, Koppel’s vision, in which cyber-attacks skip smaller-scale events and leap directly to the most severe consequences imaginable, is wildly unrealistic and pessimistic.