Fortune 500 Company - Disaster Recovery Case Study


| Challenge

The Fortune 500 company’s automated mobile workforce management and service optimization solution runs its business-critical services on 600+ EC2 instances, 25 databases, and domain controllers. Of these, 400+ EC2 instances run in US West regions and 200+ run in Europe regions. The customer’s team of 5 CloudOps engineers was tasked with delivering a Recovery Point Objective (RPO) of <1 hour and a Recovery Time Objective (RTO) of <4 hours. This came on top of managing compliance, security, governance, and cloud operations for their entire AWS footprint, which comprises 10,000 EC2 instances and dozens of other AWS services. Linearly growing the team size was not an option.


| Solution

MontyCloud DAY2™ helped the customer achieve high resiliency for their mission-critical multi-tier applications by automating their Disaster Recovery process with a DAY2™ autonomous BOT. The DAY2™ Disaster Recovery BOT is simple to deploy and configure to protect their EC2-based workloads. It uses AWS native technologies such as CloudEndure, AWS Systems Manager Agent, CloudWatch, and Automation services to automatically:

  • Enroll the business-critical EC2 instances into the DAY2™ Disaster Recovery BOT as they are created
  • Enforce snapshot policies, manage replication, and check consistency
  • Ensure the Elastic Load Balancers, Security Groups, Domain Controllers, and IP addresses are configured and ready
  • Monitor heartbeats and send a Slack notification asking for approval to start a Disaster Recovery process
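
The last step in the list above, heartbeat monitoring followed by a human approval gate, can be illustrated with a minimal Python sketch. It is not MontyCloud's implementation and makes no AWS or Slack API calls: the `HeartbeatMonitor` class, the `run_check` function, the 60-second timeout, and the `request_approval` and `start_failover` callbacks are all hypothetical names chosen for illustration. In a real deployment the callbacks would presumably wrap a Slack approval prompt and the recovery workflow.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat time per instance (hypothetical helper)."""
    timeout_seconds: float  # assumed threshold, not taken from the case study
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record(self, instance_id: str, now: float) -> None:
        # Store the time at which this instance last reported in.
        self.last_seen[instance_id] = now

    def stale(self, now: float) -> List[str]:
        # Instances whose last heartbeat is older than the timeout.
        return sorted(
            instance_id
            for instance_id, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


def run_check(
    monitor: HeartbeatMonitor,
    now: float,
    request_approval: Callable[[List[str]], bool],
    start_failover: Callable[[List[str]], None],
) -> str:
    """Run one monitoring pass; fail over only after explicit approval."""
    missing = monitor.stale(now)
    if not missing:
        return "healthy"
    # The approval callback stands in for a notification that a person answers.
    if request_approval(missing):
        start_failover(missing)
        return "failover_started"
    return "not_approved"
```

Keeping the approval step as an injected callback means the automated part (detecting stale heartbeats) and the single manual part (the approval) stay separate, which matches the case study's description of approval as the only manual step.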


| Outcome

With the MontyCloud DAY2™ Disaster Recovery BOT, the customer exceeds their RPO/RTO expectations without expanding their team. The customer achieves:

  • < 1 minute Recovery Point Objective (vs. their goal of 1 hour) for all 600+ EC2 instances across AWS regions
  • < 1 hour Recovery Time Objective (vs. their goal of 4 hours)
  • Successful Disaster Recovery tests at least once every 24 hours, and sometimes several times a day
  • Elimination of manual validation of successful failovers, with the DAY2™ Disaster Recovery BOT keeping teams notified of status
  • Elimination of custom scripts, proprietary third-party agents, and manual intervention, freeing the customer to refocus their talent on other growth areas




Software Technology – Sales CRM SaaS


North America




A Fortune 500 company with tens of thousands of enterprise customers worldwide enabled one-click cross-region disaster recovery (DR) for a multi-tier enterprise application running on 600+ Amazon Elastic Compute Cloud (EC2) instances across 2 regions in the US and 2 regions in Europe.



"Our customer went from days of RTO to exceeding their expectations of 4 hr RTO with AWS native technologies. One of the benefits of a BOT based automation using AWS native technologies is that our customer is now able to perform DR tests on demand, sometimes several times a day. The only thing manual about this solution is the approval on Slack®."


Delivery Architect at GreenPages | MontyCloud’s Solutions Partner
