Avoid Memory Issues with Veeam and ReFS

There have been reports of issues that users have run into after incorporating ReFS repositories with Veeam Backup and Replication. Here’s how to best avoid running into them:

According to reports, users are running into issues on Windows 2016 servers using drives formatted with ReFS while running Veeam synthetic operations. So far this issue has been primarily reported by users who have formatted their ReFS volumes using a 4K block size, which is set by default when formatting a new volume. Veeam spokespeople have recommended that users use a 64K block size as the primary method to avoid this issue. 

Check the current allocation unit size of ReFS volumes using:

fsutil fsinfo refsinfo <volume pathname>

Microsoft has also released a patch(KB4013429) as well as a corresponding knowledge base article regarding this issue. The fixes include a patch + registry changes specific to the issue with ReFS. The patch adds the option to create and fine tune the following registry parameters to tweak the ReFS memory consumption:

RefsEnableLargeWorkingSetTrim | DWORD
RefsNumberOfChunksToTrim      | DWORD
RefsEnableInlineTrim          | DWORD

The reason for the errors during the synthetic operations performed during Veeam Backups to ReFS repositories is that Veeam synthetic operations are very I/O intensive. Users have uncovered an issue with the ReFS file system where metadata stored in memory is not released properly. This causes the utilized memory on the system to balloon and can eventually lock up the OS.

Windows SysInternals RAMMap
Image Source

Using the Windows SysInternals Tool: RAMMap, users can monitor the memory usage during synthetic fulls. This will help determine if the metafile is growing and if there will be a potential issue with memory.

Finally, suggestions for avoiding running into this error:

  • Choose 64K Block Size when formatting new ReFS volumes for Veeam repository. Avoid using 4K Block Size for now.

If you are currently using an ReFS volume with 4K Block Size consider migrating the repository to a new volume with 64K Block Size. This post may assist you.

  • Use more than the minimum recommended memory for Veeam Repositories, that is, 4GB plus up to 4GB for each concurrently running job.
  • If you are currently using ReFS with 4K blocks and/or are running into issues with you Veeam repository locking up during synthetic operations, apply the patch + corresponding registry changes. If these do not resolve the issues, try adding more memory to the Veeam repository server.

64K block sizes are already largely recommended as a best practice for Veeam repositories, considering how Veeam works with large files. The issue here being that Windows sets the default allocation unit size at 4K so users may skip past changing it when formatting new volumes. Hopefully future releases of Veeam will be able to detect and warn users against 4K block sizes during the creation of ReFS repositories.

Update 11/3/17: Some users with larger amounts of backup data report issues even when using 64K block size yet.

More current updates of Windows Server 2016 now include additional registry settings to curb some of the issues that users have continued to report. As of the time of this update, it has been suggested that those experiencing these issues set these decimal values for the following registry keys:

HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\RefsEnableLargeWorkingSetTrim = 1
HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\RefsNumberOfChunksToTrim = 32
HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\RefsDisableCachedPins = 1
HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\RefsProcessedDeleteQueueEntryCountThreshold = 512

Some Veeam technicians/architects have also suggested, for backup jobs whose retention requirements are 100 days/restore points or less, to avoid synthetic fulls unless they are specifically necessary.

eg: For a retention policy going back a month, aim for 30 daily incremental restore points, rather than 7 daily and 3 weekly restore points.

This will still provide the benefits of fast cloning from ReFS but avoid the issue of synthetic full merges potentially locking up the storage during the full backup file merge processes.

 

More Info: Veeam Forums; VeeamLiveMicrosoft KB4016173Microsoft KB4035951