The problem with backups

Legacy approaches to backup in the IT industry are largely focussed on performing a backup of an entire server. This means that multiple backups will contain a full copy of the Windows operating system, and other files that will never need to be restored. This article offers some tips for approaching backup in the cloud.

The traditional approaches of GFS (grandfather, father, son) retention, and of differential, incremental, synthetic full and full backups, result in large amounts of duplication. They stem primarily from historical technology constraints, such as lengthy backup durations, unreliable backups or media, and the impact on production systems. When services are moved to the cloud, many of these constraints no longer apply and need to be re-evaluated.

Key Objectives of backup

The fundamental requirement of backup is to achieve recovery. If a backup cannot be recovered, it is useless, and the effort and cost of performing the backup are wasted. Recovery can be categorised into several areas:

  • Recovery of data, including databases, audit log entries and configuration
  • Recovery of files
  • Recovery of systems, such as servers (IaaS) and technologies (PaaS & SaaS)
  • Recovery of services, including SaaS

These different requirements for recovery will inform the method taken to perform backups.

Backups are also required for retention of data, including the ability to return to previous versions and dates. Depending upon the classification of the information, data retention periods may vary.
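
As a minimal sketch of how classification-driven retention might be expressed, the Python below maps classification labels to retention periods and checks whether a backup is still inside its period. The labels and durations are assumptions chosen for illustration; they do not come from any standard or regulation.

```python
from datetime import date, timedelta

# Hypothetical classifications and retention periods (illustrative assumptions only).
RETENTION_DAYS = {
    "operational": 90,
    "financial": 7 * 365,
    "audit": 10 * 365,
}


def must_retain(classification: str, backup_date: date, today: date) -> bool:
    """Return True while the backup is still inside its retention period."""
    period = timedelta(days=RETENTION_DAYS[classification])
    return today - backup_date <= period
```

A table like this makes the retention decision explicit and reviewable, rather than leaving it implicit in the configuration of a backup tool.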

Approach alternatives to backup

Backup approaches must be chosen with a close eye on the eventual need to restore the backup. For that reason, there are several alternatives for meeting the requirements of backup:

  • Code backup – such as automation scripts, configuration and customisation of SaaS/PaaS platforms, API implementations, and the code or scripts used in consuming services
  • Off-system data streaming – such as log aggregation (e.g. SIEM) and transaction log shipping
  • File retention and versioning – including systems that retain delta copies (changed components) and permit version control
  • Snapshots and images
  • Application controlled backups – such as SQL backups or other application controlled backup mechanisms
  • Export of data to an archive

Traditional approaches to backup have focused on availability, retention, disaster recovery and compliance.

Technical considerations

For each type of system, there will be different considerations, including:

  • SaaS platforms may not allow third-party backups to be taken
    • Restoration to SaaS platforms may not be supported by the vendor
    • Restores of SaaS and PaaS may interfere with production due to embedded configuration or references to production resources
  • Testing of restores may not be possible to SaaS or PaaS systems
    • Some restores may be an “all or nothing” approach
    • Non-production instances may not be available in all SaaS / PaaS offerings
  • Data sovereignty (is data stored in Australia?)
    • Other compliance requirements, including PCI-DSS and FOI, and regulations such as SOX or GDPR that may apply in the future
  • Data transmission (ingress / egress) costs and bandwidth limitations
  • Data storage costs for backups (including the retention of data for extended periods)
  • Encryption of data in transit and at rest over public networks and repositories

Cloud backup tips

Take into account the following cloud backup tips when you are planning for cloud backups.

Usability of backups

A backup must be usable once it is restored. Therefore, the following need to be considered:

  • For a SaaS platform, a backup must allow individual files or records to be restored or accessed, to an alternative location other than the production system
    • For example, an email or attachment must be able to be restored without needing to be restored to the original mailbox in the active mail system
    • A record in the CRM system must be able to be restored without over-writing an existing record, or damaging audit logs or integration records
  • If an application container (Docker, Kubernetes, etc.) is restored, this must be possible in a sandpit or isolated environment that will not try to interact with, or join a cluster in, the production environment.
  • The operating system and application versions must be able to be either installed or restored for the application data to be recovered. This may also need to take into account patches and modifications.
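
The requirement to restore somewhere other than production can be sketched as below. This is an illustration under stated assumptions rather than a real restore tool: it treats a backup as a single file, the function names and the directory arguments are hypothetical, and a checksum comparison stands in for whatever usability test a real system would need.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_to_alternate(backup_file: Path, production_dir: Path, restore_dir: Path) -> Path:
    """Copy a backed-up file into restore_dir, refusing to write inside production_dir."""
    restore_dir = restore_dir.resolve()
    production_dir = production_dir.resolve()
    if restore_dir == production_dir or production_dir in restore_dir.parents:
        raise ValueError("restore target must be outside the production location")
    restore_dir.mkdir(parents=True, exist_ok=True)
    restored = restore_dir / backup_file.name
    shutil.copy2(backup_file, restored)
    # Verify the restored copy byte-for-byte against the backup before handing it over.
    if sha256_of(restored) != sha256_of(backup_file):
        raise IOError("restored copy does not match the backup")
    return restored
```

The guard at the top is the point of the sketch: the restore path is checked before anything is written, so an operator cannot overwrite live data by accident.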

Historical versions, snapshots and version history

The backup approach must be able to access data from previous versions or points in time. This is required so that deleted or changed information can be recovered without damaging live or active data. This is more than just restoring to an alternate location: the data must also be restorable to an isolated location.

For this reason, automatic versioning and snapshot approaches may not meet the requirements for backup if the only way they work is to restore over the top of current, live data. If versioning or snapshot recovery only allows the restore to be renamed in the original location, problems may arise where integrations or embedded configuration within the restored data affect a production data instance. Examples include a document that updates external data sources, or a snapshot that attempts to authenticate with out-of-date credentials when starting and so causes an account lockout.

Backup must have an index and records that allow an administrator to review and select a backup from a previous time. This is to allow a record/file/configuration from weeks, months or even years before to be recovered.
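
A minimal sketch of such an index lookup follows. The `BackupRecord` type and its fields are hypothetical; a real backup catalogue would carry far more metadata (scope, type, checksum, retention class).

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class BackupRecord:
    taken_at: datetime
    location: str  # hypothetical identifier, e.g. an object key or archive path


def latest_at_or_before(catalogue: List[BackupRecord], point_in_time: datetime) -> Optional[BackupRecord]:
    """Return the most recent backup taken at or before point_in_time, or None."""
    ordered = sorted(catalogue, key=lambda record: record.taken_at)
    times = [record.taken_at for record in ordered]
    # bisect_right gives the count of backups taken at or before the requested time.
    index = bisect_right(times, point_in_time)
    return ordered[index - 1] if index else None
```

Returning `None` when no backup is old enough makes a retention gap visible to the administrator, instead of silently offering the wrong backup.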

SaaS solutions

Where software is provided as a service, the customer has less access to the underlying system. There are also limitations on restoring: if the SaaS provider is unavailable, a backup that can only be restored to that provider could be useless. For example, a backup of Salesforce data cannot be restored to an on-premises instance or to another provider, as the Salesforce system is provided by only one company and exists only in the cloud.

In another example, if there is a global denial-of-service (DoS) attack on Microsoft, then a backup of Microsoft Exchange Online must be able to make email content accessible from a backup system that is not hosted within the extensive Microsoft cloud ecosystem.

Some SaaS or PaaS solutions are tightly integrated with other services, and if those services are unavailable, the restored data may be inaccessible. For example, consider a historical restore of a system instance (such as a container) that was integrated with a SaaS solution which has since changed, or for which no contract exists any more: would this prevent the restored instance from starting and being usable?

For this reason, particular focus – customised for the SaaS provider – must be made on the usability of restored data, how it is stored and accessed, and business continuity.

Backup of SaaS solutions, and the availability and retention of the data held in it, should be considered in contract negotiation and selection of SaaS products and their version/level.

Pragmatic solutions for IaaS and PaaS

With a focus on backups that can restore usable data, from any time in the past and to a separate location, a practical approach to backup is needed to ensure that objectives can be met.

A backup for an IaaS solution may consist of a version-controlled instance of the build script, commented with metadata, held in the DevOps code repository, in conjunction with an application-level data backup that is exported to an object bucket (such as S3). This would not be a backup of the IaaS server, its operating system and the application binaries. Instead of focusing on how to recover the entire server, the focus is on the ability to bring an identical server back online and provide access to its data.
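
One way to tie the two halves of such a backup together is a small manifest that records which revision of the build script goes with which data export. The sketch below is an assumption about how that might look, not a description of any particular tool; the field names, paths and object key format are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RecoveryManifest:
    service: str
    build_script: str     # path to the build script inside the repository (hypothetical)
    repo_revision: str    # commit or tag that was in production at backup time
    data_export_key: str  # object key of the application-level data export (hypothetical)
    exported_at: str      # ISO 8601 timestamp of the data export


def write_manifest(manifest: RecoveryManifest) -> str:
    """Serialise the manifest as JSON so it can be stored next to the data export."""
    return json.dumps(asdict(manifest), indent=2, sort_keys=True)


def read_manifest(text: str) -> RecoveryManifest:
    """Rebuild the manifest from its JSON form when planning a recovery."""
    return RecoveryManifest(**json.loads(text))
```

With a manifest like this, recovery becomes two steps that can be rehearsed separately: rebuild the server from `repo_revision`, then load the export named by `data_export_key`.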

Where there are auto-scaled or load-balanced systems, not all of them will need a dedicated backup. Provided that the recovery of one system enables service recovery and access to historical data, not all components need to be backed up.

Some solutions may depend upon appliances – these are systems that have minimal functionality or reduced access to the underlying operating system. These solutions are often difficult to back up, but very easy to re-deploy. For this reason, the approach may be to depend upon a build document that describes how to deploy another instance, or a few scripts that apply integration, customisation and configuration.

A backup for a PaaS product that performs data analysis and reporting may require that data is in a particular format for the PaaS tool to use – but this may not be usable in another tool. In the event that a decision is made to move to a different PaaS product to achieve the same business function, the backups must be in a format that the new product can use, or the old product will need to be maintained for the duration of the retention period. Backup of reports may be limited to the business requirement definition of the report so that it could be re-created in another platform.

Backup categorisation/type

Depending on the requirements of the system for recovery, several options are available for performing a backup. This is not a full guide, but will give you some cloud backup tips.

Snapshot / Full VM backup

The historical “standard” way of performing a backup is to take an entire copy of the server, including its operating system, application executable files, data and temporary files. To ensure that data is not in RAM and is committed to disk, technologies such as VSS and fsfreeze are used to create a snapshot of the server.

This approach to backup can be slower, take more disk storage, and take longer to restore from. However, the benefit is that a restore of the system is complete – particularly if the backup of multiple systems is “application aware” and informs the running application that it is being backed up.

Where there are multi-tier systems, or multiple VMs or servers are required to provide the service, these must be backed up together in an “application consistent” backup that ensures, for example, that the database and application server are in sync and that all transactions are in a consistent state.

Code repository

For applications and services that have been deployed by code, and that can be restored to service by re-running that code, a valid backup mechanism may be to store the code, scripts and definitions in a code repository. In a DevOps approach, where automated system deployment or configuration is stored in a code repository, this is a valid approach to backup because it permits recovery to service.

It is important in these scenarios to ensure that any unique data or transactions are captured through different means. Using a code repository as a backup depends primarily on the system being stateless.

Valid uses of a code repository for backup purposes depend on the code being appropriately labelled, versioned and tested. Code that is appropriate for restore of services should be marked to indicate that it is the current production version, so that development code is not used for recovery.
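
The selection rule described above can be sketched as follows. The `Release` type, the stage labels and the version tuples are hypothetical; in practice this information would more likely live in repository tags or release metadata.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Release:
    version: Tuple[int, int, int]
    stage: str    # hypothetical labels: "development", "staging", "production"
    tested: bool


def recovery_release(releases: Iterable[Release]) -> Release:
    """Pick the newest release that is both marked as production and tested."""
    candidates = [r for r in releases if r.stage == "production" and r.tested]
    if not candidates:
        raise LookupError("no tested production release is available for recovery")
    return max(candidates, key=lambda r: r.version)
```

Failing loudly when no suitable release exists is deliberate: falling back to development code during a recovery is the outcome the labelling is meant to prevent.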

Code repositories can store container definitions, networking configuration, SaaS customisation or scripting, along with PowerShell or DSC definitions for Azure, and AWS CloudFormation templates.

Application and database backup

Where the application is capable of performing backups directly, this is a preferred option to provide more reliable recovery. Backup processes within the application or service can also be triggered or managed by an external backup system that is aware of the application.

Backup of a database should be managed by the database engine, as this will ensure that pending transactions are committed, that transaction logs are purged on completion of the backup, and that the database logs record that a backup has been completed.
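
As an illustration of letting the engine perform the backup, the sketch below uses SQLite's online backup API through Python's standard `sqlite3` module. SQLite is only a stand-in chosen because its driver ships with Python; it is not a database named in this article, and the function name and file paths are hypothetical. Other engines expose their own mechanisms with different semantics for transaction logs.

```python
import sqlite3


def engine_backup(source_path: str, target_path: str) -> None:
    """Copy a SQLite database using the engine's own online backup API."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(target_path)
    try:
        # Connection.backup (Python 3.7+) asks the engine to produce a consistent
        # copy, rather than copying the database file from underneath it.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

The design point is the same for any engine: the copy is taken through the database, which knows which transactions are complete, rather than from the file system underneath it.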

This type of backup is preferred, and can be used in conjunction with other backup approaches.

Vendor Service Level or Contract

Many types of systems in the cloud cannot be restored by external means. With a focus on recovery capability, it may not be appropriate to attempt to back up these systems. Instead, recovery and availability are addressed through contracts. For a SaaS solution, the vendor must provide their own backup capability (similar to the application backup approach above): performing the backup is the responsibility of the SaaS vendor or supplier, who is contractually required to take backups that allow data and services to be restored within a Service Level Agreement.

It is important to note that not all SaaS providers will have the same type of SLA or services offered for recovery and availability – and this will require contract investigation, which may result in additional products or services being purchased or implemented to achieve the business requirements for retention, availability or capability.

Cloud backup tips

I hope that this has given you some cloud backup tips that can save you from some of the mistakes and myths of cloud, and allow you to recover systems in the event of a failure.
