By Haim Glickman
In recent weeks, we’ve seen several IT failures that left thousands of customers across the country frustrated.
First, Cisco Webex experienced a complete outage, and users were still experiencing intermittent issues 24 hours later. The interruption was apparently caused by a rogue script that began deleting the virtual machines hosting the service. As Cisco put it, “This was a process issue, not a technical issue.”
Then Verizon experienced voice, text, and data service interruptions that affected states across the South and Midwest and stretched into the Northeast. The outage appeared to last about three hours.
To cap it off, a “technology issue” temporarily grounded Delta aircraft nationwide in the latest airline outage. Tweets from the company said the “computer tracking system” was down, and that the issues were system-wide. The outage lasted for at least an hour.
For all three companies, customer complaints spread quickly on social media, reinforced by media coverage. As technology evolves at an ever faster pace, we have come to expect and demand zero interruptions and 100 percent connectivity.
Not to mention that outages can be terribly expensive. Based on survey data, Gartner uses $5,600 per minute as a gauge of the cost of network downtime. For these recent outages, that’s a significant hit to the bottom line in a short period of time.
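As a back-of-the-envelope illustration, the Gartner gauge can be applied to the approximate outage durations reported above. This is only a rough sketch: the $5,600 figure is a survey-based average, not the actual loss of any company named here, and the function name is made up for this example.

```python
# Rough downtime cost estimate using Gartner's survey-based gauge.
# Assumption: the per-minute figure is an industry average, not a
# company-specific number.
COST_PER_MINUTE_USD = 5_600


def estimate_downtime_cost(minutes, cost_per_minute=COST_PER_MINUTE_USD):
    """Return the estimated cost in dollars of an outage lasting `minutes`."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


# Approximate durations as reported in this article.
outages = {
    "Verizon (about three hours)": 180,
    "Delta (at least an hour)": 60,
}

for name, minutes in outages.items():
    print(f"{name}: ${estimate_downtime_cost(minutes):,}")
```

By this gauge, a three-hour outage comes to roughly $1 million and a one-hour outage to about $336,000, before counting any reputational damage.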
Why customers are less and less patient with outages
Companies are expected to architect their offerings for higher availability than ever before. Even an active-active setup is no longer good enough without another layer of redundancy.
With Cisco Webex’s communications tools (Calling, Meetings, Control Hub, Hybrid Services, and Teams) all suffering issues, global organizations were stifled, unable to hold meetings or collaborate in the other ways they have grown accustomed to. No company can afford to lose that much productivity.
That was the big issue for Verizon customers as well. Three hours without a working mobile phone, as some Verizon customers experienced, is becoming harder and harder to manage today as most of us go through our daily lives with constant connectivity to the world. Our phones help us communicate with family, friends and co-workers, as well as remotely manage security and other systems at our homes – and so much more. Sever that connection and be prepared to feel the heat.
Delta’s outage is just the latest in an airline industry that seems to experience a breach or an interruption every other week. In the past, interruptions were rare, catastrophic events. Now, especially in a connected world, consumers expect zero interruptions and on-time travel.
How to prevent your company from becoming the next outage headline
The bottom line is that companies must take their security and availability more seriously as organizations and the public become increasingly reliant on technology-driven services.
As IT grows more complex and capable, it creates more opportunities for failure. That is why every company should test for the scenarios these companies experienced before they happen, and put in place stronger failover and recovery plans for when – not if – systems fail.
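To make the idea of testing for failure concrete, here is a toy failover drill. It is a minimal sketch, not a description of how any of the companies above operate: every class and function name is hypothetical, and a real drill would target actual infrastructure in a controlled environment rather than in-memory objects.

```python
# A toy failover drill. All names are hypothetical and the "backends" are
# in-memory stand-ins for real services.

class Backend:
    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


def call_with_failover(backends, request):
    """Try each backend in order and return the first successful response."""
    errors = []
    for backend in backends:
        try:
            return backend.handle(request)
        except ConnectionError as exc:
            errors.append(str(exc))
    raise RuntimeError("all backends failed: " + "; ".join(errors))


def run_failover_drill():
    """Deliberately take the primary down and check the standby takes over."""
    primary, standby = Backend("primary"), Backend("standby")
    backends = [primary, standby]

    # Normal operation: the primary answers.
    assert call_with_failover(backends, "ping") == "primary served ping"

    # Simulate the outage on purpose: the standby must answer instead.
    primary.healthy = False
    assert call_with_failover(backends, "ping") == "standby served ping"

    # A total failure must be reported loudly, not silently swallowed.
    standby.healthy = False
    try:
        call_with_failover(backends, "ping")
    except RuntimeError:
        return "drill passed"
    raise AssertionError("total outage went undetected")


print(run_failover_drill())
```

The point of the drill is that the failure is triggered on purpose, so any gap in the failover path shows up in a test run and not in front of customers.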
It’s always better to fail in a test and be prepared than to alienate customers and land negative stories in the national news.