Recovery
Availability-Zone Failure
An availability-zone failure may cause temporary traffic disruptions. The application-tier autoscaling group will recreate the instances required to handle all incoming traffic in under 15 minutes.
Region Failure
The architecture outlined in this guide does not natively support multi-region operation. To restore in a different region, the optional cross-region replication of backups must have been enabled before the failure.
Recovery Time Objective (RTO) and Recovery Point Objective (RPO)
Deeploy is mostly a stateless application. The business-critical data is stored in Amazon RDS for PostgreSQL and Amazon S3. We strongly recommend daily backups for both services; see the Backup section below for more information. With daily backups, a recovery time objective (RTO) of under 4 hours and a recovery point objective (RPO) of under 24 hours are generally achievable. To restore operations:
- Recreate the stateless EKS cluster as described here and install the Deeploy stack as described here.
- Recreate the RDS database for PostgreSQL as described here and restore the RDS backup as described here.
- Recreate the S3 storage as described here and restore the previous object versions as described here.
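
To make the RPO target concrete, the following minimal sketch computes how much data would be lost if a failure happened at a given moment, and checks that against the 24-hour objective. The function names and the example timestamps are illustrative assumptions and are not part of Deeploy or AWS.

```python
from datetime import datetime, timedelta

# Target taken from this guide: an RPO of under 24 hours.
RPO = timedelta(hours=24)


def data_loss_if_failure_at(failure_time, backup_times):
    """Return the time between a failure and the most recent backup before it."""
    earlier = [t for t in backup_times if t <= failure_time]
    if not earlier:
        raise ValueError("no backup exists before the failure time")
    return failure_time - max(earlier)


def meets_rpo(failure_time, backup_times, rpo=RPO):
    """True if a failure at `failure_time` would lose less than `rpo` of data."""
    return data_loss_if_failure_at(failure_time, backup_times) < rpo


# Hypothetical daily backups taken at 02:00 (illustrative values only).
backups = [datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 2, 2, 0)]

# A failure 18 hours after the last backup stays within the objective.
print(meets_rpo(datetime(2024, 1, 2, 20, 0), backups))

# If the next daily backup is missed, a failure 25 hours after the last one does not.
print(meets_rpo(datetime(2024, 1, 3, 3, 0), backups))
```

The second case shows why a single missed daily backup is enough to exceed the objective, so it is worth monitoring that backups actually complete.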
Backup
We strongly recommend daily backups and snapshots for RDS, and versioning for S3, to minimize data loss and downtime. For details, see the AWS documentation on RDS backups and snapshots and on S3 versioning.
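
As a rough illustration of these two settings, the sketch below uses the boto3 calls `modify_db_instance` and `put_bucket_versioning`. The helper names, the 7-day retention period, and the resource names in the usage comment are assumptions made for this example; the infrastructure templates for this guide may already configure both settings, so treat this as a sketch rather than the required procedure.

```python
def enable_daily_rds_backups(rds_client, db_instance_id, retention_days=7):
    """Enable RDS automated (daily) backups by setting a non-zero retention period.

    `rds_client` is expected to be a boto3 RDS client. A retention period of 0
    would disable automated backups, so it is rejected here.
    """
    if not 1 <= retention_days <= 35:
        raise ValueError("retention_days must be between 1 and 35")
    return rds_client.modify_db_instance(
        DBInstanceIdentifier=db_instance_id,
        BackupRetentionPeriod=retention_days,
        ApplyImmediately=True,
    )


def enable_s3_versioning(s3_client, bucket_name):
    """Turn on object versioning so earlier object versions can be restored.

    `s3_client` is expected to be a boto3 S3 client.
    """
    return s3_client.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={"Status": "Enabled"},
    )


# Usage sketch (requires boto3 and AWS credentials; resource names are placeholders):
#   import boto3
#   enable_daily_rds_backups(boto3.client("rds"), "my-deeploy-db")
#   enable_s3_versioning(boto3.client("s3"), "my-deeploy-bucket")
```

Passing the clients in as arguments keeps the helpers easy to exercise against a stub before running them against a real account.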
Maintenance
Regular updates should be applied as rolling updates using the latest version of our Helm chart, as described here. We recommend checking for updates regularly and reading the release notes. Updates can also include changes to the infrastructure; in that case, instructions and templates will be provided.