Why Hyper-V VHD Files Are So Large – And How To Efficiently Copy Them

While in Seattle for TR 10, I had an opportunity to meet up with an innovative startup called VMUtil.com (www.vmutil.com). Led by a Microsoft File System / Storage MVP, they have come up with some excellent tools that make Hyper-V more efficient with non-SAN storage and add value when SAN storage is used. This blog covers one of those tools, VHDCopy, which optimizes copies of fixed-size VHD files, and its sister tool, VHDCopy Enterprise Edition, aka VHDCopEE. I found these tools of special interest because they resolved an issue I was having in a large Hyper-V environment.

Hyper-V VHD files tend to be very large for a number of reasons:

  1. Microsoft best practices suggest using fixed-size VHDs in a production Hyper-V environment.
  2. Erring on the side of caution means creating a larger VHD – if in doubt whether to create a 60GB or an 80GB VHD file, the answer is to create the 80GB VHD file.
  3. If VSS-based snapshots are desired, it is a requirement, albeit an often overlooked one, that the VSS differencing store also be part of the same VHD. This area can be as large as 15% of the total VHD size.

The net result is that a typical VHD will be fairly large but will contain a large area of free space as seen by the NTFS file system within the VHD file.
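To put rough numbers on this, the short sketch below works through the arithmetic for a hypothetical fixed-size VHD. The function name and the example figures (an 80GB VHD holding 20GB of data, with the VSS area at the 15% upper bound mentioned above) are illustrative assumptions, not measurements from a real system.

```python
GB = 1024 ** 3  # bytes per gigabyte (binary)


def vhd_space_breakdown(vhd_size_bytes, used_bytes, vss_percent=15):
    """Rough space breakdown for a fixed-size VHD (illustrative only).

    vss_percent is the assumed upper bound for the VSS differencing area,
    expressed as a whole-number percentage of the total VHD size.
    """
    if not 0 <= used_bytes <= vhd_size_bytes:
        raise ValueError("used_bytes must be between 0 and vhd_size_bytes")
    free_bytes = vhd_size_bytes - used_bytes
    return {
        # Integer arithmetic avoids floating-point rounding on large sizes.
        "vss_area_max_bytes": vhd_size_bytes * vss_percent // 100,
        "free_bytes": free_bytes,
        "free_fraction": free_bytes / vhd_size_bytes,
    }


# Hypothetical example: an 80GB fixed-size VHD holding 20GB of data.
breakdown = vhd_space_breakdown(80 * GB, 20 * GB)
print("VSS area (upper bound):", breakdown["vss_area_max_bytes"] // GB, "GB")
print("Free space inside VHD: ", breakdown["free_bytes"] // GB, "GB")
print("Share of VHD that is free:", breakdown["free_fraction"])
```

In this hypothetical case three quarters of the file is space that NTFS inside the VHD regards as free, yet a fixed-size VHD still occupies the full 80GB on the host volume.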

Imagine you are copying an 80GB VHD file, where NTFS inside the VHD sees an 80GB volume with 60GB free.

  • Xcopy, the copy command, or Robocopy will copy the full 80GB, including the 60GB of “free space”. VHDCopy and VHDCopEE will copy only the 20GB of data.
  • If the VHD sits on thin-provisioned SAN storage, the 60GB of unused space is not stored (assuming the VM has never run; if the VM has run, there are some issues that are food for another blog). Even so, a backup run from the Hyper-V parent partition would cause 60GB worth of zeros to cross the PCI bus on the source volume of the backup. A backup run from within the VM would of course copy only the 20GB, but that requires a costly agent inside each VM.
  • Similarly, even when the VHD is on a thin-provisioned SAN disk, Xcopy and Robocopy would cause 80GB of data, including 60GB worth of zeros, to cross the PCI bus on the source volume.
  • VHDCopy and VHDCopEE would ensure that only 20GB of data crosses the PCI bus on the source volume.
  • If the VHD file is being copied across a network, VHDCopy and VHDCopEE would ensure that only 20GB of data flows across the network.
  • When run with the /Secure option, and assuming the destination volume also supports NTFS semantics, VHDCopy and VHDCopEE very cleverly ask the destination file system to zero fill the 60GB. So even though the 60GB of data never flows across the network, the destination VHD is still securely zero filled!
  • It does not matter whether the source VHD has been securely zero filled or not, so there is no need to run Secure Delete, aka SDelete. VHDCopy and VHDCopEE can identify the unused parts of the VHD without that requirement.
  • In other words, VHDCopy and VHDCopEE are the equivalent of a fast, accelerated copy plus a run of SDelete, and they make sure the destination VHD is thin provisioned if the block storage volume at the destination supports that. You can actually use VHDCopy to decrease the storage used on a thin-provisioned SAN volume – all the parts of the volume that NTFS has deleted but that have not yet been freed will not occupy storage at the destination. The exact savings will depend upon your SAN storage.
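The general idea of not moving empty space can be illustrated with a much simpler technique than the one described above. The sketch below is not how VHDCopy or VHDCopEE work: they are described as finding unused regions without needing them to be zero filled, whereas this sketch only detects blocks that already contain nothing but zeros and seeks past them in the destination. The function name and block size are hypothetical choices for illustration. Whether the skipped regions actually save disk space depends on the destination file system; on NTFS, for example, a file has to be flagged as sparse first, which this sketch does not do.

```python
import os

BLOCK_SIZE = 1024 * 1024  # 1 MiB; an arbitrary choice for this sketch


def copy_skipping_zero_blocks(src_path, dst_path, block_size=BLOCK_SIZE):
    """Copy src to dst, writing only blocks that contain non-zero bytes.

    Simplified stand-in technique: all-zero blocks are skipped with a seek
    instead of being written. Returns (bytes_written, bytes_skipped).
    """
    zero_block = bytes(block_size)
    written = 0
    skipped = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            # The final block may be shorter than block_size.
            zeros = zero_block if len(block) == block_size else bytes(len(block))
            if block == zeros:
                # Leave a hole in the destination instead of writing zeros.
                dst.seek(len(block), os.SEEK_CUR)
                skipped += len(block)
            else:
                dst.write(block)
                written += len(block)
        # Make the destination exactly as long as the source, even when the
        # last block was skipped.
        dst.truncate(src.tell())
    return written, skipped
```

Calling `copy_skipping_zero_blocks("source.vhd", "copy.vhd")` (both paths are placeholders) returns how many bytes were written and how many were skipped, which gives a rough feel for how much of a zero-filled image is real data.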

Possible use cases for VHDCopy and VHDCopEE

  • Moving VMs from a test server to a production server. Export only the VM configuration and import it using widely published scripts such as https://bit.ly/4cc77G, then use VHDCopy or VHDCopEE to perform an accelerated VHD file copy
  • Creating a base image for Microsoft System Center Data Protection Manager
  • Backing up a VM
  • Migrating a VM from Windows Server 2008 and its single VHD per LUN model to Windows Server 2008 R2 CSVs and its 4-8 VHDs per LUN model
  • Checking in or checking out a VM to/from Microsoft SCVMM
  • Copying or backing up iSCSI LUNs on Windows Storage Server

Pricing

Please contact sales@vmutil.com for pricing and licensing.

VHDCopy is priced per physical server and is available for Windows Storage Server, Windows Server 2008, and Windows Server 2008 R2. Windows Storage Server uses fixed-size VHDs to implement its iSCSI LUNs. VHDCopy can push a VHD file from a local volume to a network volume, or pull a file from a network volume to a local volume. VHDCopy does not support CSV volumes.

VHDCopEE (the Enterprise Edition) is also priced per physical server and is available for Windows Server 2008 R2 only. VHDCopEE can pull a VHD file from a network volume to a local volume or a CSV volume. VHDCopEE can also push a VHD file from a local volume or CSV volume to a network volume. The CSV can be in direct access mode or in redirected access mode.