If you believe your traditional disaster recovery plan will help recover your data after a cyberattack, you might want to think again.
John Beattie, Principal Consultant at Sungard Availability Services, joins IT Availability Now to lay out the facts and offer the best approach for recovering compromised data after a successful cyberattack:
- The differences between compromised data recovery and traditional disaster recovery
- What strategies to put in place before a cyberattack
- How to create a response team and ensure it’s prepared to act
Brian Fawcett is a Senior Manager of Global Sales Engagement at Sungard AS. With over 15 years of experience in a range of industries, he specializes in forming enterprise-wide global talent and learning development programs. Brian has enriched corporate learning culture by matching organizational vision and core values to curricula, leading to application and impact.
As a Principal Consultant at Sungard AS, John Beattie works closely with organizations to implement third-party risk management programs and to reduce operational risk by establishing new business continuity and disaster recovery programs or transforming existing ones to improve effectiveness.
The full transcript of this episode is available below.
Brian Fawcett (BF): If you think your disaster recovery plan will help recover your data after a cyberattack, well, you have some work to do. I'm your host Brian Fawcett, and this is IT Availability Now, the show that tells stories of business resilience from the people who keep the digital world available. We all know we should be protecting our data from potential cyberattacks, but equally important is preparing to recover compromised data following a successful ransomware or wiperware attack. And it's not as simple as you might think. That's why we're talking to John Beattie, Principal Consultant at Sungard AS, about how to prepare. John, welcome to the show.
John Beattie (JB): Thanks for having me, Brian.
BF: So John, a cyberattack can be looked at as a special type of disaster situation, but from what I understand your traditional disaster recovery plans might not be sufficient for successfully recovering compromised data after an attack. Tell us, why is that?
JB: Well at a basic level, it’s because a cyberattack simply is a different recovery case. Traditional disaster recovery plans and capabilities focus on a physical infrastructure loss, and most of the time, that's unharmed in a cyberattack situation. But recovering compromised data requires different response and recovery strategies, and if you don't recognize and plan for the differences, it'll be more difficult and time consuming to recover your data, especially when you're under stress and under the microscope.
BF: So you said if you can recognize and plan for differences, what are some of the differences companies should be planning for?
JB: Well there's actually four major differences, Brian. First, the triggering event - disaster recovery plans focus on recovering from a physical data center disruption. Your infrastructure has been compromised in some way, your applications are unavailable, your network may be down. But data recovery, as its name implies, focuses on recovering data that has been compromised by the attack, regardless of where that data is located. And there's a production impact - in a physical disaster, you utilize data that's been backed up into your recovery environment. In essence, you're standing up a new production environment that is either in place, or readily stood up for this particular purpose. In a data recovery effort, however, you typically are doing a recovery in place, meaning that data stored elsewhere, once it is proven malware free, is brought back into your original production environment onto clean hardware, and that's hardware that's been rebuilt to ensure that it's malware free. Disaster recovery and data recovery also focus on different data. In disaster recovery efforts, you typically use the most recently replicated or backed up data that's already in place. But, if your data has been compromised in a cyberattack situation, your replicas are typically also compromised, and potentially multiple generations of your backups have been compromised as well. So you'll be looking for the most recent malware clean data you can find, and that may actually go back days, weeks or in some cases even months. And finally, in a data recovery situation, you're less likely to hit your disaster recovery-focused objectives. In disaster recovery, you've probably successfully been testing your plan, you've hit your RTOs and your RPOs, and you have a good sense of whether you'll be able to achieve them, and you'll be able to execute according to a well-defined script.
But with data recovery, you need a lot more time to understand the nature of the attack, what's been impacted, and what the data synchronization points are. You're rarely going to meet your RTOs or your RPOs, and it's really impractical for the business to expect that.
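Editor's note: the search for "the most recent malware clean data" described above can be pictured as walking backup generations from newest to oldest until one passes a cleanliness check. The following is only a minimal sketch under stated assumptions. The `Backup` record, the `is_clean` callback (standing in for whatever verification the information security team performs) and the dates are all hypothetical and do not come from the episode.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Backup:
    """Hypothetical record describing one backup generation."""
    label: str
    taken_at: datetime


def latest_clean_backup(
    backups: Iterable[Backup],
    is_clean: Callable[[Backup], bool],
) -> Optional[Backup]:
    """Return the newest backup that passes the cleanliness check, or None."""
    # Walk generations from newest to oldest and stop at the first clean one.
    for backup in sorted(backups, key=lambda b: b.taken_at, reverse=True):
        if is_clean(backup):
            return backup
    return None


def data_loss_window(incident_time: datetime, backup: Backup) -> timedelta:
    """Time between the chosen restore point and the incident."""
    return incident_time - backup.taken_at


# Illustrative data only: the two newest generations are assumed compromised.
backups = [
    Backup("mar-09", datetime(2024, 3, 9)),
    Backup("mar-02", datetime(2024, 3, 2)),
    Backup("feb-10", datetime(2024, 2, 10)),
]
compromised = {"mar-09", "mar-02"}

chosen = latest_clean_backup(backups, lambda b: b.label not in compromised)
if chosen is not None:
    loss = data_loss_window(datetime(2024, 3, 10), chosen)
    print(chosen.label, loss.days)
```

In this made-up case the restore point is a month old, which is far beyond a typical disaster recovery RPO measured in hours. That gap is the reason the business has to be prepared for data loss beyond its usual objectives.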
BF: Thanks for going through those four major differences. Clearly, data recovery is a different recovery case that requires a different approach. What do you need to do differently to successfully recover your compromised data?
JB: Well the first thing to understand is that every data recovery situation is going to be unique. You just can't have a cookie cutter response as you can with typical disaster recovery. You have to probe deeper and ask additional questions before finalizing your tactical data recovery plan. So for example, do any production machines need to be rebuilt, replaced and rehydrated after the cyberattack? Should you be using new servers that were never even on the network? If you were hit by a ransomware attack, are you going to be paying the ransom? Will you be doing your recovery efforts in parallel with negotiating with the bad actors who perpetrated the attack? And then there's also the challenge of keeping unimpacted production running while you're recovering the compromised data. You also have to have a handle on what you consider to be your most vital data assets. These might be different from what's considered top tier or critical in your DR plan. We often find that there is data in many organizations that is very vital, or if you will, central to the organization, but is simply not covered by their disaster recovery program. So you'll need a plan to recover that vital data, both efficiently and effectively. And in order to do that, you may need to make some extra investments to improve your odds of having clean data to use for recovery purposes.
BF: So where do these considerations start? How can companies prepare for a cyberattack that compromises their data?
JB: Well many organizations are very familiar with what's called the 3-2-1 recovery architecture, and we've adapted that concept into what we call 3-2-1-1. Let me kind of explain what that means, because if you have this in place, we believe that you have much better odds of being able to recover your data effectively and be able to effectively protect it. So there's really three areas of separation, that's where the “3” comes from: people, process and technology. In terms of people, that means having separate people involved, and having access to your data. So different folks have access to production than have access to the backup data that you might be called upon to use. So you want to be able to protect that from a rogue employee or compromised credentials disrupting your program. For process, it's about having separate backup processes and schedules. You want that to vary. You don't want that to be predictable, you don't want the bad actors being able to get in there and understand how you're doing your backups and what your schedule is. Then, of course, there's having separate backup technologies and separate backup locations. So you want to make sure that you have data spanning multiple locations so that you have, again, better odds of being able to find clean data when you need it. The “2” in the 3-2-1-1 model refers to your recovery strategies. You need a data recovery strategy to back up and restore data, and you also need specific strategies that are more focused on recovering applications, like you do in your DR program. So data recovery and application recovery are simply two different scenarios, and you need specific strategies for each. The first “1” in the 3-2-1-1 model refers to maintaining, at minimum, one off-network immutable copy of your data. That gives you your best odds of being able to find clean data at the time that you need it.
Think of this as kind of a data vault, if you will, where data is placed in there, and it simply does not change, and the ability to change it has been significantly locked down. The final “1” in the 3-2-1-1 model means that you need to maintain an off-network secure environment, so that you can analyze data and make sure it’s clean before you repatriate it back into your production environment. It's very important and we've seen many data recovery efforts fail because corrupted data, if you will, compromised data, is once again brought back into the production environment because it wasn't adequately analyzed in a safe space.
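Editor's note: the 3-2-1-1 model described above lends itself to a simple self-assessment checklist. The sketch below is only an illustration of the four elements as John lists them. The `RecoveryPosture` fields and the `find_gaps` function are hypothetical names invented for this note, not part of any Sungard AS tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecoveryPosture:
    """Hypothetical self-assessment of a 3-2-1-1 data recovery architecture."""
    # "3": separation across people, process and technology
    separate_people: bool
    separate_process: bool
    separate_technology: bool
    # "2": a data recovery strategy and an application recovery strategy
    data_recovery_strategy: bool
    application_recovery_strategy: bool
    # First "1": at minimum one off-network immutable copy of the data
    offnetwork_immutable_copies: int
    # Final "1": an off-network secure environment for analyzing data
    offnetwork_analysis_environment: bool


def find_gaps(posture: RecoveryPosture) -> List[str]:
    """List the 3-2-1-1 elements that the given posture does not satisfy."""
    gaps: List[str] = []
    if not posture.separate_people:
        gaps.append("3: production and backup access are held by the same people")
    if not posture.separate_process:
        gaps.append("3: backup processes and schedules are not separate or varied")
    if not posture.separate_technology:
        gaps.append("3: backup technologies and locations are not separate")
    if not posture.data_recovery_strategy:
        gaps.append("2: no data recovery strategy")
    if not posture.application_recovery_strategy:
        gaps.append("2: no application recovery strategy")
    if posture.offnetwork_immutable_copies < 1:
        gaps.append("1: no off-network immutable copy of the data")
    if not posture.offnetwork_analysis_environment:
        gaps.append("1: no off-network secure environment to verify data is clean")
    return gaps


# Illustrative posture: everything in place except the immutable data vault.
posture = RecoveryPosture(
    separate_people=True,
    separate_process=True,
    separate_technology=True,
    data_recovery_strategy=True,
    application_recovery_strategy=True,
    offnetwork_immutable_copies=0,
    offnetwork_analysis_environment=True,
)
for gap in find_gaps(posture):
    print(gap)
```

A real assessment would need far more nuance than a set of yes/no flags, but the structure shows how each digit in 3-2-1-1 maps to something an organization can check for.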
BF: So, in the aftermath of an attack, who executes the plan, because you mentioned having a separate backup team, so who all is involved?
JB: Well there's many different disciplines that are involved and they all have to work together as a single team when you need to do a data recovery, and typically there's a separate plan that's put in place by most organizations in order to make that happen effectively. Responsibility usually lies with your disaster recovery team to manage those efforts, but of course well supported by your information security specialists, because those are the folks that are responsible for verifying that candidate data that's targeted for repatriation is clean, and advising you on how best to ensure that the hardware is malware free as well. You also need your infrastructure and operations teams involved because they need to make sure that they can recover the most recent clean backups, validate restore points, and make sure that all the data that needs to be synchronized is synchronized before you start up your production environment again. And the business continuity team is also very important in this process. They have to have strategies in place in case you experience an extended outage as you recover the data, or if you're going to permanently lose more data than the RPOs allow. The business needs to be ready to be able to deal with that loss of vital data.
BF: Thanks for going through the backup team, that is really helpful. Obviously each group has a different focus, so what's your advice for getting them all working together as one?
JB: Well, just like in disaster recovery, it’s important to test. And it's important to test the various scenarios that could potentially compromise your data so that everyone develops that muscle memory, if you will, for how to respond quickly and efficiently, while also verifying that it's possible to recover data and to identify clean data using all the capabilities that you have in place.
BF: Data loss, or corruption, is many companies' worst nightmare, but with the right precautions and the right strategies, you should be able to recover. Recognizing the difference between compromised data recovery and traditional disaster recovery use cases is the first step to ensuring you're prepared to recover from a variety of cyberattacks, from ransomware situations to malware corrupting data. By implementing the 3-2-1-1 data recovery architecture, preparing various teams within the organization who will need to respond, and consistently and regularly testing your plan, you'll be much better positioned to recover compromised data following an attack. John, thanks for joining us today.
JB: Thank you Brian, it's great to be here.
BF: John Beattie is Principal Consultant at Sungard AS. You can find the show notes for this episode at SungardAS.com/ITAvailabilityNow. Please subscribe to the show on your podcast platform of choice to get new episodes as soon as they are available. IT Availability Now is a production of Sungard Availability Services. I'm your host Brian Fawcett, and until next time, stay available.