On 9 August 2019 the UK felt the impact of what National Grid described as an "incredibly rare" event. The effects were immediate: homes were plunged into darkness, traffic lights failed, trains were cancelled and passengers stranded, hospitals suspended non-essential procedures and many businesses suffered financial losses.
The government has launched an investigation by the Energy Emergencies Executive Committee, and National Grid has been asked by the Business Secretary, Andrea Leadsom, to "urgently review and report to Ofgem".
There is little doubt that the technical detail of what caused the gas-fired power station at Little Barford and the Hornsea offshore wind farm to go offline almost simultaneously will be collected and analysed in forensic detail, together with the automatic safety systems that shut off power to some places to protect the integrity of the grid. Likewise, the communications procedures will be probed, along with the ability of services such as rail transport and hospitals to survive a 42-minute power outage.
However, perhaps the answer to what went wrong lies not in the technical detail but in the fact that the risk landscape has changed and that the metrics that are collectively used to drive investment and planning for infrastructure disruptions are now reaching their limit of usefulness.
Most electric power utilities, which have long been seen as leaders in the critical infrastructure community for contingency planning, are still driven by regulation built on "reliability" metrics such as the number of customers interrupted, customer minutes lost and mean daily fault rates.
Such metrics work well under normal operating conditions, but they undervalue the impact of large-scale events because they price lost load at a flat rate. In reality, the value of lost load compounds the longer the load is lost. For example, most customers will put a very different cost on the first few minutes of an outage, when the disruption is merely inconvenient, than on days of disruption, or on weeks, by which point modern life becomes simply impossible. Likewise, the impacts of large-scale events are disproportionately high, driven by abnormal restoration costs and widespread, complex infrastructure damage. Large-scale events are therefore often included only in the narrative of risk registers, while the reliability metrics, especially when combined with an accessibility and affordability target, drive planning and investment towards smaller, more common events rather than larger, rarer, yet more disruptive ones.
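To make the difference between flat-rate and compounding valuations concrete, the following is a minimal sketch rather than any regulator's or utility's actual method. The function names, the price per megawatt-hour, the size of the lost load and the 10% hourly growth rate are all hypothetical figures chosen purely for illustration.

```python
def flat_rate_cost(load_mw: float, minutes: int, voll_per_mwh: float) -> float:
    """Cost of lost load when every unserved MWh is priced the same."""
    return load_mw * (minutes / 60) * voll_per_mwh


def compounding_cost(load_mw: float, minutes: int, voll_per_mwh: float,
                     hourly_growth: float) -> float:
    """Cost of lost load when the price per MWh grows the longer the outage lasts.

    Assumption (illustrative only): the price rises by `hourly_growth`
    (e.g. 0.10 for 10%) for each hour of outage, compounding smoothly.
    The total is accumulated in one-minute steps.
    """
    total = 0.0
    for minute in range(minutes):
        price_now = voll_per_mwh * (1 + hourly_growth) ** (minute / 60)
        total += load_mw * (1 / 60) * price_now
    return total


if __name__ == "__main__":
    # Hypothetical inputs: 1,000 MW of lost load priced at 10,000 per MWh.
    load_mw, voll = 1000.0, 10_000.0
    for label, minutes in [("42 minutes", 42), ("48 hours", 48 * 60)]:
        flat = flat_rate_cost(load_mw, minutes, voll)
        compounded = compounding_cost(load_mw, minutes, voll, hourly_growth=0.10)
        print(f"{label}: compounding valuation is {compounded / flat:.1f}x the flat-rate one")
```

Under these assumed figures the two valuations are close for a short outage and diverge sharply for a long one, which is the sense in which a flat-rate metric understates the cost of prolonged, large-scale events.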
However, widespread economic instability, disruptive technologies, hyper-extended supply chains, terrorism and organised cyber-crime are now commonplace. Likewise, grid operations have grown more complex because of changing power demand, increased reliance on renewable sources and the growing introduction of smart technologies. Together, these have created a risk landscape that is no longer relatively stable and interspersed with occasional shocks, but one unremittingly characterised by uncertainty, complexity and adversarial risks.
Low-probability, high-consequence events are now much more common, and energy researchers such as Vugrin, Castillo and Silva-Monroy of Sandia National Laboratories have recognised that the historical data used for reliability calculations may not be suitable for characterising potential future outages, because emerging threats can differ significantly from historical precedents.
The concept of "resilience" in complex socio-economic systems reliant on technology is not new, but it is something that is hard to regulate as it is subjective and involves the combined effort of technology, systems, people, processes, leadership and culture.
However, if we are to avoid more disruptions of the type we saw last week, we need to change the way we incentivise infrastructure investment. Rather than simply promoting grid reliability, which focuses effort on preventing a disruptive event from occurring, we also need to promote energy sector resilience. That means ensuring that power generators, distributors and the organisations that convert power into citizen services, such as transport and health sector organisations, can continue to provide goods and services to the communities that rely upon them, regardless of the occurrence of disruptive events.
To find out more about managing uncertainty and promoting resilience within complex socio-economic systems, please contact the Sungard Availability Services Resilience Consulting practice.