How to Develop a Disaster Recovery plan with AWS

Last Updated on January 1, 2021

To keep the business running and minimise downtime, you need a disaster recovery (DR) plan; a traditional on-site disaster recovery policy is often no longer enough. An automated disaster recovery site on AWS can be a cost-effective and efficient alternative. Simply being on the cloud does not solve the problem, though: business continuity depends on a concrete plan that is executed well. Below is a step-by-step guide to developing a disaster recovery plan with AWS.

1. Identify your infrastructure

Your infrastructure may be large and varied, so the first step is to identify every part of it. It includes network, storage, hardware, software and more. If your infrastructure runs on AWS, you can use AWS Tag Editor to find your resources and map them properly, which saves time and effort later. Nothing should be left out: the plan can only cover what has been identified.
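As a rough illustration of the mapping step, the sketch below groups resources by the value of one tag and lists the ones that carry no such tag. It is a minimal sketch, not an AWS tool: the sample data is made up and only mimics the `ResourceARN` and `Tags` fields of the Resource Groups Tagging API, and the `app` tag key and the `group_by_tag` helper are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical sample data, shaped like entries of the Resource Groups
# Tagging API's ResourceTagMappingList. ARNs and tags are made up.
SAMPLE_RESOURCES = [
    {"ResourceARN": "arn:aws:ec2:us-east-1:111122223333:instance/i-0abc",
     "Tags": [{"Key": "app", "Value": "checkout"}]},
    {"ResourceARN": "arn:aws:rds:us-east-1:111122223333:db:orders",
     "Tags": [{"Key": "app", "Value": "checkout"}]},
    {"ResourceARN": "arn:aws:s3:::legacy-reports", "Tags": []},
]


def group_by_tag(resources, tag_key):
    """Group resource ARNs by the value of one tag.

    Resources that do not carry the tag are collected under the key None,
    so they can be reviewed and tagged before the plan is written.
    """
    groups = defaultdict(list)
    for res in resources:
        tags = {t["Key"]: t["Value"] for t in res.get("Tags", [])}
        groups[tags.get(tag_key)].append(res["ResourceARN"])
    return dict(groups)


inventory = group_by_tag(SAMPLE_RESOURCES, "app")
untagged = inventory.get(None, [])

for app, arns in inventory.items():
    print(app or "<untagged>", len(arns))
```

The untagged list is the useful part here: anything that shows up in it is a resource the plan does not yet know how to classify.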

2. Collaborate with the whole development team

Always consult the people involved in development, whether you have an in-house team or have hired developers from outside as a service. All the dependencies among the elements of your infrastructure must be clearly defined. Third-party resources that you rely on, such as the Google Maps API, must be considered as well.
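Once the dependencies are written down, they can be turned into a recovery order. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to sort components so that each one comes after everything it depends on. The component names and the `recovery_order` helper are hypothetical; a real map would come from the conversations with your team.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key depends on the components in its set.
DEPENDENCIES = {
    "web-frontend": {"orders-api", "maps-api (third party)"},
    "orders-api": {"orders-db"},
    "orders-db": set(),
    "maps-api (third party)": set(),
}


def recovery_order(dependencies):
    """Return components ordered so that dependencies come first.

    TopologicalSorter raises graphlib.CycleError if the map contains a
    circular dependency, which is itself worth knowing before a disaster.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = recovery_order(DEPENDENCIES)
print(order)
```

Third-party services appear in the map even though you cannot restore them yourself, because the components that depend on them cannot be considered recovered until those services are reachable.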

3. Each element in the infrastructure must be given its due importance

Not every element is equally urgent. For example, the data and interaction history of current customers matters more than the records of customers who are no longer active. Prioritize elements accordingly, so that the most important ones are recovered first within the agreed time frame. Elements that can be recovered more slowly will generally cost you less.

4. RTO and RPO have their own importance

Recovery Time Objective (RTO) and Recovery Point Objective (RPO) are two important targets to discuss with the stakeholders. RTO is the maximum acceptable time before an application is back up and running after a disruption. RPO is the maximum acceptable amount of data loss, expressed as how far back in time the recovered state may be. A shorter RTO makes recovery more expensive, so it is important to balance what the business wants against what it actually needs.
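A small worked example can make the two targets concrete. The sketch below assumes a simple periodic-backup strategy, where the worst case is a failure just before the next backup, so up to one full backup interval of data can be lost. The function names and all the figures are illustrative assumptions, not recommendations.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """With periodic backups, a failure just before the next backup
    loses everything written since the previous one."""
    return backup_interval


def meets_objectives(backup_interval, estimated_restore_time, rpo, rto):
    """Compare a backup strategy against the agreed RPO and RTO."""
    return {
        "rpo_met": worst_case_data_loss(backup_interval) <= rpo,
        "rto_met": estimated_restore_time <= rto,
    }


# Illustrative figures: backups every 4 hours, restores take about 45 minutes,
# and stakeholders asked for an RPO of 1 hour and an RTO of 2 hours.
result = meets_objectives(
    backup_interval=timedelta(hours=4),
    estimated_restore_time=timedelta(minutes=45),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=2),
)
print(result)  # {'rpo_met': False, 'rto_met': True}
```

In this example the restore is fast enough, but the backups are too infrequent for the requested RPO, so either the backup interval or the target has to change.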

5. Use gathered information for creating plans

Recovery plans range from simple to complex, depending on the infrastructure and the needs of the organization. Use what you gathered in the previous steps (the resource inventory, the dependencies, the priorities, and the RTO and RPO targets) to choose among them. Make full use of the cloud-based tools available to you; AWS provides flexible solutions that can be customized to your needs.

6. Take care of the in-house communication network

There are plenty of IT infrastructure monitoring tools that watch for failures, track server parameters and raise alerts. One such tool is Amazon CloudWatch, which lets you monitor metrics, CloudWatch Events and services such as AWS Lambda. You can also reassign developers to monitor your infrastructure around the clock and execute the DR plan when things suddenly go wrong. But this pulls your core team away from their core tasks, and it can increase your costs, since the team may have to be paid overtime if problems continue over the weekend. An AWS professional services consultant will warn you about this.
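To show what an automated alert might look like, the sketch below builds the parameters for a CloudWatch alarm on the EC2 `StatusCheckFailed` metric. It only constructs a dictionary and does not call AWS. The instance ID, the SNS topic ARN, the alarm name and the chosen period and threshold are all assumptions made for the example.

```python
def status_check_alarm_params(instance_id, sns_topic_arn):
    """Build keyword arguments for a CloudWatch alarm that fires when an
    EC2 instance fails its status checks for two consecutive minutes.

    All concrete values here are illustrative assumptions.
    """
    return {
        "AlarmName": f"dr-status-check-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,            # seconds per evaluation window
        "EvaluationPeriods": 2,  # consecutive windows that must breach
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],  # who gets notified
    }


# Hypothetical identifiers, for illustration only.
params = status_check_alarm_params(
    "i-0abc1234567890def",
    "arn:aws:sns:us-east-1:111122223333:dr-alerts",
)
print(params["AlarmName"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. Routing the alarm to a notification topic is one way to avoid having developers watch dashboards around the clock.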

7. Test and be assured

Testing your plan is one of the most important steps, and skipping it can land you in trouble. DR measures must not only be planned in advance; you must also make sure they work as soon as a disaster occurs. AWS provides tools for staging and testing your solutions. An efficient way to do this is to create a duplicate environment in which real-world scenarios are run and tested on a regular basis.
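The outcome of each drill in the duplicate environment is worth recording and comparing against the agreed targets. The sketch below is one minimal way to do that. The `DrillResult` structure, the component names and the timings are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class DrillResult:
    component: str
    rto: timedelta                # target agreed with stakeholders
    measured_recovery: timedelta  # time the drill actually took


def failed_drills(results):
    """Return the components whose measured recovery exceeded their RTO."""
    return [r.component for r in results if r.measured_recovery > r.rto]


# Hypothetical results from one drill in the duplicate environment.
drill = [
    DrillResult("orders-db", timedelta(hours=1), timedelta(minutes=40)),
    DrillResult("orders-api", timedelta(minutes=30), timedelta(minutes=50)),
]
print(failed_drills(drill))  # ['orders-api']
```

Running the same comparison after every drill shows whether recovery times are drifting as the infrastructure changes.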