An FRS customer asks about managing staging folders

A customer migrating from FRS to DFS Replication recently asked our DFS Replication PM, Shobana Balakrishnan, the following question:

For FRS we are using one logical drive (the S: drive) for all FRS replicas on the server.  The S: drive is 50 GB, which should be large enough to manage the 300+ GB of replicated folders on the server.  We currently set the staging limit to 40 GB and the Outlog Change History in Minutes to 1440 (1 day), and this has been working well for us.  Now that we are moving to DFS Replication, we would like to give each replicated folder a 40 GB staging quota on the S: drive, so that all replicated folders on a given server share the same 40 GB of space.  If we were to use the built-in StagingHighWaterMarkPercent of 90%, it seems that the replicated folders (especially the larger ones) would each want to maintain about 36 GB of files in their staging areas before doing cleanup (90% of the allocated 40 GB), and we would eventually run out of the 50 GB of logical disk space.  Is there a way to manage the staging directory similar to what we have done in FRS?
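To make the concern concrete, here is a quick back-of-the-envelope check in Python (an illustrative sketch; the numbers are taken from the question above, and the 90% figure is the StagingHighWaterMarkPercent the customer mentions):

```python
# Back-of-the-envelope check: with a 40 GB staging quota per replicated
# folder and a 90% high watermark, each folder tends to hold roughly
# 36 GB of staged files before cleanup kicks in.
drive_gb = 50            # size of the S: drive
quota_gb = 40            # proposed staging quota per replicated folder
high_watermark = 0.90    # cleanup starts above this fraction of the quota

steady_state_gb = quota_gb * high_watermark   # ~36 GB per folder
print(f"Each folder may hold up to ~{steady_state_gb:.0f} GB before cleanup")

# Even two replicated folders at the watermark would want ~72 GB,
# well past the 50 GB logical drive -- hence the disk-full worry.
for n in range(1, 4):
    total = n * steady_state_gb
    status = "OK" if total <= drive_gb else "overcommitted"
    print(f"{n} folder(s): ~{total:.0f} GB wanted, drive holds {drive_gb} GB -> {status}")
```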

Here is Shobana’s response: 

Staging cleanup in DFS Replication is scoped to the replicated folder, so depending on the activity and file sizes of each replicated folder, you may want to size each staging folder differently.  How many replicated folders do you have in all? I would recommend specifying n different staging paths on the S: drive, one for each of the n replicated folders, with sizes as large as possible. Since you have 300+ GB of replicated data, how much do you want staging to occupy – just 40 GB? What are the five largest file sizes for each replicated folder? As a rule of thumb, the staging size should be set to at least the combined size of the five largest files so that replication is not blocked on staging cleanup.  To prevent your staging folders from growing until they hit disk full (left unbounded, their cumulative size could approach the 300+ GB of replicated data), I would suggest placing a hard quota (you can use File Server Resource Manager in R2) on the staging top folder. For example:

S:\staging -> quota of 100 GB

S:\staging\Folder1 -> size of 10 GB, assuming the five largest files are expected to total less than 5 GB. When the folder reaches 9 GB, staging will be cleaned up.

S:\staging\Folder2 -> size of 10 GB

S:\staging\Foldern -> size of 10 GB

Note that in this example, if n > 10 you will likely hit disk full (due to the quota), which will cause cleanup of all your replicated folders' staging paths down to the low watermark or half their previous size, whichever is lower.
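The sizing rule and the quota arithmetic above can be sketched in Python (an illustrative sketch only; `min_staging_gb`, the folder count, and the paths scanned are assumptions for the example, not part of DFS Replication itself):

```python
import os

def min_staging_gb(folder, n_largest=5):
    """Rule of thumb from above: size staging to at least the combined
    size of the n largest files in the replicated folder, so replication
    is not blocked on staging cleanup."""
    sizes = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            try:
                sizes.append(os.path.getsize(os.path.join(root, name)))
            except OSError:
                pass  # file vanished or is inaccessible; skip it
    return sum(sorted(sizes, reverse=True)[:n_largest]) / 2**30  # bytes -> GB

# Planning check for the layout above: the n per-folder staging sizes
# must fit under the hard quota on the staging top folder, or you will
# hit disk full and trigger cleanup in every folder's staging path.
parent_quota_gb = 100
per_folder_quota_gb = 10
n_folders = 8  # hypothetical count for this server
assert n_folders * per_folder_quota_gb <= parent_quota_gb, \
    "cumulative staging sizes exceed the parent quota"
```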

I would suggest keeping your staging sizes as large as possible, especially on hubs that have to serve more than one partner on different schedules, to prevent restaging overhead.  Note that the RDC hashes are calculated on the staged file and saved as an alternate stream on the file, so keeping more staging files around is beneficial for DFS Replication, especially if you are using the cross-file RDC feature. Also note that, unlike FRS, the staging directory is used for both outbound and inbound replication.