Amazon Web Services (AWS) had a fairly public outage this past weekend, and much has been written about it. This event certainly impacted a lot of applications and the people who use and support them, but it got me thinking about how far we’ve come on availability as an industry over the past 20 years. When events like this occur, I would humbly inject some perspective alongside the hyperbole:
For those of us responsible for keeping applications up and running, the events of this past weekend should serve as a reminder that we need to design with availability and resiliency best practices in mind. The fact is that many organizations have plenty of applications that haven’t been rebuilt to take advantage of the latest public cloud functionality and documented best practices. Getting there takes new approaches to infrastructure and applications, using new technologies, processes and skill sets. Practically speaking, we simply can’t refactor all of our apps overnight, so we have to assess the value of availability for each application and make prioritization choices. We must also transform our processes and the skills of our teams at the same time. For those managing large enterprise application portfolios, this can be a daunting task.
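To make the prioritization idea a bit more concrete, here is a minimal sketch of one way to rank a portfolio: convert each application's availability into expected hours of downtime per year, multiply by an assumed cost per hour of downtime, and sort. The application names, availability figures and cost figures are all hypothetical, and real prioritization would weigh more than a single number.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


@dataclass
class App:
    name: str                  # hypothetical application name
    availability: float        # assumed availability, e.g. 0.999 for "three nines"
    cost_per_hour_down: float  # assumed business cost of one hour of downtime


def expected_downtime_hours(availability: float) -> float:
    """Hours per year an application at this availability is expected to be down."""
    return (1.0 - availability) * HOURS_PER_YEAR


def annual_downtime_cost(app: App) -> float:
    """Expected yearly cost of downtime for one application."""
    return expected_downtime_hours(app.availability) * app.cost_per_hour_down


def prioritize(apps):
    """Return apps ordered from highest to lowest expected downtime cost."""
    return sorted(apps, key=annual_downtime_cost, reverse=True)


# Illustrative portfolio; every figure here is made up for the example.
portfolio = [
    App("order-entry", 0.999, 20000.0),
    App("internal-wiki", 0.99, 200.0),
    App("reporting", 0.995, 1500.0),
]

for app in prioritize(portfolio):
    hours = expected_downtime_hours(app.availability)
    cost = annual_downtime_cost(app)
    print(f"{app.name}: about {hours:.1f} h/year down, roughly ${cost:,.0f}/year")
```

In this made-up portfolio, the application with the best availability still lands at the top of the list, because each hour it is down costs far more than an hour of downtime for the others.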
The size and scope of the challenge aren’t an excuse not to get started. To do our part, we’ve been working on ways to make applications more recoverable in public clouds like AWS, using functionality such as RDS read replicas, Route 53, CloudWatch, DynamoDB Streams, CloudFormation, EBS, S3, Glacier and more. AWS re:Invent is coming up October 6-9, and Sungard Availability Services will be there to share more about some of the exciting things we’re doing on AWS. Please stop by the booth, say hello and find out more.
Thanks for reading!