Just after 6.00am on Sunday 11 December 2005, an explosion of monumental proportions occurred at the Buncefield Oil Depot, close to Junction 8 of the M1 motorway in Hemel Hempstead. The depot was one of the largest fuel supplying facilities in the country and stored aviation fuel as well as petrol and other oil-based substances. The explosion was later considered to be the largest ever in peacetime Europe and, for Northgate Information Solutions, it resulted in what may be the biggest and most complex business recovery project ever undertaken.
The headquarters building of software and IT services company Northgate Information Solutions was one of the closest to the source of the explosion. It suffered extensive blast damage and several parts of the building were set on fire. Northgate had invested significantly in business continuity (BC) and high availability provision. Two backup generators were installed on site and full UPS protection and multiple communications networks were in place. But the entire on-site BC infrastructure was destroyed as a result of the explosion.
Entire BC infrastructure destroyed
This created a major business continuity problem, not just for Northgate but also for the company’s clients. Northgate is a leading UK supplier to the human resource and public service markets and works with approximately 90 percent of UK local authorities, all UK police forces as well as being active in the education, utilities and corporate sectors.
Northgate’s clients outsource various critical processes to the company, including payroll, web services and data processing. All of these were crucial services, none more so than a £1.4 billion payroll run that was due by the end of December: if it was not performed, citizens within four London boroughs and the staff of 186 payroll clients would be personally disadvantaged at a critical time of the year.
Northgate immediately placed Sungard Availability Services on standby and convened a teleconference with the Emergency Recovery Team and the company’s executive. The decision was taken to meet at Northgate’s Holborn office at midday; from here the team would manage and coordinate resources in line with the business continuity plan.
Systems and assets actioned for IT and Workplace Recovery
Once invoked, Sungard AS began making systems and assets available for both IT and Workplace Recovery. By the following morning, a team of over 100 project and technical staff had commenced the recovery of customer systems within Sungard AS’s London Technology Centre and other Northgate-designated recovery locations.
The first step was to perform a triage and determine recovery priorities, with customer systems being prioritised over Northgate’s own internal systems. The scale of the recovery soon became apparent: a total of 212 systems relating to 209 customers needed to be re-established.
The Emergency Recovery Team also had to resolve another urgent issue quickly: how to accommodate Northgate staff displaced by the loss of the headquarters building. These staff were temporarily relocated to other company sites while alternative accommodation in the Hemel area was located.
A shift system was established, with recovery staff working back-to-back 12-hour shifts around the clock, and the recovery team settled into a daily routine. For the first month of the recovery, a conference call was held at 5.00pm each day, at which the Emergency Recovery Team gave an update on activities and set priorities for the coming 24 hours.
After the first week, the majority of production services had been restored, enabling core activities to recommence. The crucial Christmas payroll run was conducted on time – a great achievement given the scale of the difficulties surmounted to get to that point.
Since what may well be the largest and most complex recovery ever undertaken – an achievement acknowledged with the accolade of Most Effective Recovery of the Year in the 2006 Business Continuity Awards – Northgate has not only survived but thrived, doubling in size three times in the ensuing years.
Business continues despite massive explosion
- Ability to continue providing critical outsourced processes to customers
- Preserved reputation for reliability
- Avoided breaches of contractual agreements and the financial loss they would have caused
Under the leadership of David Tate, the company’s strategy has evolved to take into account the lessons learned from its experience:
Splitting the risk
The majority of company departments and Northgate’s IT infrastructure had been based at a single head office site. This also housed the BC infrastructure – two backup generators, an Uninterruptible Power Supply (UPS) and multiple communications networks – all of which was completely destroyed. In common with many veterans of large-scale or wide-area incidents, Northgate’s strategy today is to split the risk, and it now has a number of data centres across the globe.
Streamlining the process
Previously, Northgate maintained separate BC and disaster recovery plans based on denial of site for all of its locations, regardless of their size or importance. This resulted in more than 450 plans. Northgate recognised that a more streamlined solution would better meet the planning needs of its fast-growing business, so a review was undertaken to assess the criticality of key processes against three operational criteria: the ability to perform a customer contract, to collect cash or to sell. Consequently, Northgate now has a plan for each business unit to ensure continuity of site, technology and people.
Surviving disruption on the scale of Buncefield so effectively that it was able to keep its client commitments is sound testimony to Northgate’s BCM practices. David Tate explains, “It gives clients a sense of comfort that we’ve been tested and didn’t let them down. But we’re not resting on our laurels. The company is more than six times bigger now and we face new business challenges.”
He adds, “However, it’s precisely because we handled a crisis so successfully once that we are expected to be even better prepared for the next one. It’s effectively raised the bar.”