DFS on VMM Library

We have seen some questions about using DFS with the VMM library. Here is an attempt to explain how VMM behaves when deployed in such an environment and what customers should expect before doing so.

 

First of all, what is DFS? DFS stands for Distributed File System. The Microsoft DFS solution is composed of DFS Replication (DFSR) and DFS Namespaces (DFSN). They are essentially a set of client and server services that orchestrate distributed SMB file shares into a distributed file system, and they provide location transparency and redundancy to improve data availability.

 

Here are the definitions from TechNet:

· DFSR - DFS Replication is a new state-based, multimaster replication engine that supports replication scheduling and bandwidth throttling. DFS Replication uses a new compression protocol called Remote Differential Compression (RDC), which can be used to efficiently update files over a limited-bandwidth network. RDC detects insertions, removals, and re-arrangements of data in files, thereby enabling DFS Replication to replicate only the changes when files are updated. Additionally, a function of RDC called cross-file RDC can help reduce the amount of bandwidth required to replicate new files.

· DFSN - DFS Namespaces, formerly known as Distributed File System, allows administrators to group shared folders located on different servers and present them to users as a virtual tree of folders known as a namespace. A namespace provides numerous benefits, including increased availability of data, load sharing, and simplified data migration.

 

Secondly, VMM’s support statement is that we do NOT support DFSR or DFSN.

However, there is great value in using these DFS technologies, particularly when a customer already has DFSR infrastructure set up in their global environment and wants to use DFSR to replicate the building blocks of VMs among the different library servers. That way, any time you need to provision a new VM, identical copies of the data you need are nearby (relative to the target host). If customers are willing to try, our doc team is planning to include documentation with suggestions on how you could set up and use DFS with SCVMM and what caveats you should be aware of when doing so.

As a sneak preview, here are some of the facts about how VMM behaves with respect to DFS:

· VMM 2008 is not DFS-aware, nor is it location-aware. There is no special logic in our product for DFSR or DFSN.

· To VMM, each share is a separate entity, and each copy of the same file on each share is a separate entity. Hence, each copy of the same file on each share is given a unique GUID in the VMM database and is updated as a unique database object.

· VMM does not work with the global namespace in DFSN. However, without going into too many VMM implementation details, I’ll just say that users can decide to take advantage of DFSR at the cost of the following known issues:

o To take advantage of DFSR, you will need to create and manage a separate virtual machine template on each local library share, referencing the VHD copy on that same share, so that virtual machine creation uses the local copy of the VHD.

o Changes to any of the DFSR shares will result in missing content on other shares:

· When you move files around folder structures on one DFSR share, the file paths on that share are updated in VMM. However, on the library shares that contain the replicated files, the moved files show up as new files in VMM, and the original files (in the original paths) show up as Missing in VMM. You will need to manually clean up (remove) the missing files from other shares in the VMM library.

· When you deploy a virtual machine from one DFSR share in the VMM library, it will appear as Missing in VMM on the other DFSR shares and will need to be manually removed from those library shares. Hence, it may not be a good idea to store VMs on DFSR shares.

o There will be minor network traffic every time the library refresher runs (by default, every 60 minutes; you can lower the refresh frequency or turn the refresher off in the library settings of the VMM Administrator Console). This is due to the following competing events:

· DFSR auto-replicates and keeps all file copies AND their paths in sync.

· VMM tries to tag each copy of the same file with a unique library ID.
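
To make the Missing/new-file behavior above a bit more concrete, here is a small toy model in Python. It is only a sketch of the behavior described in this post, not of how VMM is actually implemented: the ToyLibrary class, its refresh and move methods, and the share and file names are all made up for illustration. The one assumption it encodes is the one stated above, namely that VMM tracks one database object with its own ID per copy of a file on each share.

```python
import uuid


class ToyLibrary:
    """Hypothetical stand-in for the VMM library database (not real VMM code)."""

    def __init__(self):
        # One object per (share, path) pair, so replicated copies never share an ID.
        self.objects = {}

    def refresh(self, share, paths_on_disk):
        """Reconcile one share's objects with the files currently found on it."""
        on_disk = set(paths_on_disk)
        for path in on_disk:
            entry = self.objects.setdefault((share, path), {"id": str(uuid.uuid4())})
            entry["status"] = "OK"
        for (obj_share, path), entry in self.objects.items():
            if obj_share == share and path not in on_disk:
                entry["status"] = "Missing"

    def move(self, share, old_path, new_path):
        """A move on the originating share keeps the same object and updates its path."""
        self.objects[(share, new_path)] = self.objects.pop((share, old_path))

    def status(self, share):
        return {p: e["status"] for (s, p), e in self.objects.items() if s == share}


share_a, share_b = "share-a", "share-b"  # hypothetical DFSR-replicated library shares
library = ToyLibrary()

# Both shares hold a replicated copy of the same VHD; each copy gets its own ID.
library.refresh(share_a, ["VHDs/base.vhd"])
library.refresh(share_b, ["VHDs/base.vhd"])
id_a_before = library.objects[(share_a, "VHDs/base.vhd")]["id"]

# The file is moved on share A, and DFSR replicates the move to share B on disk.
library.move(share_a, "VHDs/base.vhd", "Templates/base.vhd")
library.refresh(share_a, ["Templates/base.vhd"])
library.refresh(share_b, ["Templates/base.vhd"])
id_a_after = library.objects[(share_a, "Templates/base.vhd")]["id"]

print(library.status(share_a))
print(library.status(share_b))
```

In this model, share A ends up with a single object at the new path (same ID as before), while share B ends up with two objects: the old path marked Missing and the new path treated as a brand new file. The Missing entry on share B is the one you would have to clean up manually.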

 

 

Hope this helps and thanks for reading!

Cheng