Amazon cloud storage, S3 and Glacier, are top contenders when it comes to storing data offsite and Veeam is a behemoth when it comes to VM backup; you can use the two together but should you? The answer: well, it depends.
Integrating a Veeam Backup and Replication configuration with Amazon storage is done through an AWS storage gateway. An appliance that sits between your Veeam server and the AWS Cloud. You can connect this gateway to the Veeam server as a file server, direct attached storage, or a virtual tape library.
With the amazon storage gateway as a file server or direct attached storage you can backup directly from Veeam to the storage gateway. You can perform incremental backups this way, but in order to avoid the corruption possible with long backup chains periodic fulls will need to be taken. Synthetic operations are possible in this configuration, however without a proxy on the AWS side anytime they are performed Veeam will need to read the full backup and incrementals stored in the AWS storage, effectively downloading it from the Amazon servers, performing the synthetic operation locally, and re-uploading new synthetic data. This causes the synthetic operations to take incredibly long times to complete and is not recommended.
The other option is the virtual tape library(VTL) which allows you to present the storage gateway to Veeam as a tape server. This lets you to use Veeams tape backup jobs to create virtual tape backups on AWS storage. While Veeam tape backups allow for incremental backups, this method also requires periodic fulls to be taken any time a new tape is required. Which may end up happening frequently, depending on backup size and retention, as the maximum tape size for AWS storage gateway is 2.5TB. For restores requiring already archived/removed, it can take up to 24 hours for a tape to be retrieved, for a cost, and made available again in the tape gateway.
Alternatively, Veeam has an offsite backup method built into its distribution that comes in the form of Veeam Cloud Connect. Cloud Connect partners are third party Veeam service providers running Veeam cloud gateways and repositories who supply you with the necessary storage and compute tailored to performing offsite Veeam backups.
End users put in their service providers information and login information and are provided with a ‘Cloud Repository’ in the Veeam console. They can send backups, backup copies, etc. to the cloud repository.
Instead of having to perform periodic full backups, tenants can perform forever forever incremental backups with periodic synthetic fulls. Because the service provider is also running Veeam and has the compute required for synthetic operation, synthetic fulls and merges are able to be performed locally at the remote site—significantly reducing backup windows.
With a pre-configured appliance for cloud connect available from Veeam, Azure may be a better option compared to AWS. This lets you setup and manage your own cloud connect server in the Azure cloud to point offsite backups to. However, this adds another layer of complexity/management as well as costs for not only Azure storage but also hourly compute costs as well as additional Veeam Cloud Connect licensing. It is certainly more feasible to look for an already available Cloud Connect partner who already manages their own offsite Cloud Connect server. A Veeam Cloud Connect partner can provide the specialized backup service, offering true incremental forever backups without the complexity of creating your own service.
As far as using AWS for offsite backup with Veeam it is certainly doable, however due to the fact that full backups are required regularly it could only be recommended for those with less than 1 TB of data to backup, those with low frequency backup who can afford extremely large backup windows, or those with a lot of bandwidth available to the amazon storage servers.
Otherwise it may be more feasible to go with a cloud connect provider who can offer nightly incremental backups with regular synthetic fulls, significantly reducing backup times because they can perform the necessary synthetic operations on the remote repositories. It’s not only more time effective, but also often times more cost effective, and simpler from a management perspective.