One of the first things I wanted to play with in my Windows 8 lab was the new data deduplication feature.
In my case, I decided to make a small volume and see how well it worked with VHD files. Well, I’m happy to announce that it works pretty well!
First of all, you need to have the File and Storage Services role installed. Make sure you drill down into it and check the ‘Data Deduplication’ role service.
Once you get that going, you need to configure deduplication on a volume. It cannot be the C:\ drive (system and boot volumes are not supported). In my test server I have a giant RAID 5 array, so I used Disk Management to peel off 100GB and created an F:\ drive that I named “DEDUP”.
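If you prefer the command line, the same role service can most likely be added from an elevated PowerShell prompt. This is a minimal sketch, assuming the feature name `FS-Data-Deduplication` (the name used in released Windows Server 2012; it may differ in a pre-release build):

```powershell
# Assumption: elevated prompt on the server; feature name as in Windows Server 2012.
Import-Module ServerManager

# Add the Data Deduplication role service (part of File and Storage Services).
Install-WindowsFeature -Name FS-Data-Deduplication

# Confirm the install state, then load the dedup cmdlets used later in this post.
Get-WindowsFeature -Name FS-Data-Deduplication
Import-Module Deduplication
```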
You can now enable deduplication and configure the options to suit your environment.
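The same step can be scripted. The sketch below assumes the F: volume from my lab and a demo setup where you want newly copied files to be eligible immediately. Setting the minimum file age to 0 is my assumption for demos, not a recommendation for production:

```powershell
# Enable deduplication on the 100GB test volume from this post.
Enable-DedupVolume -Volume "F:"

# Assumption: demo setup. By default only files older than a few days are
# optimized, so lower the age threshold to make fresh copies eligible.
Set-DedupVolume -Volume "F:" -MinimumFileAgeDays 0

# Review the resulting per-volume settings.
Get-DedupVolume -Volume "F:" | Format-List
```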
It’s important to note that deduplication is not ‘real time’. The optimization process runs every hour, but you can force it to run manually using some simple PowerShell commands. (This is great for demos when you want to copy a file onto the volume, for example, and then immediately show the effect of dedup’ing.)
You can trigger an optimization job on demand in PowerShell using the Start-DedupJob cmdlet. For example:
PS C:\> Start-DedupJob -Volume F: -Type Optimization
You can query the progress of the job on the volume by using the Get-DedupJob cmdlet:
PS C:\> Get-DedupJob
The Get-DedupJob command shows jobs that are currently running or queued to run.
You can query the key status statistics, including the achieved savings on the volume, by using the Get-DedupStatus cmdlet:
PS C:\> Get-DedupStatus
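For a demo it can be handy to chain these together: kick off a job, wait for it, then print the savings. The following is a rough sketch rather than anything official. The F: volume, the 10-second polling interval, and the choice of status properties are my assumptions:

```powershell
# Queue an optimization job on the test volume.
Start-DedupJob -Volume "F:" -Type Optimization | Out-Null

# Poll until no dedup job is running or queued for F: any more.
while (Get-DedupJob -Volume "F:" -ErrorAction SilentlyContinue) {
    Start-Sleep -Seconds 10
}

# Show the headline numbers once the job has finished.
Get-DedupStatus -Volume "F:" |
    Format-List Volume, OptimizedFilesCount, InPolicyFilesCount, SavedSpace, FreeSpace
```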
In my case, with all VHD files (and a mix of Windows 7, Windows 8 Client and Server), I saw some pretty significant space savings on the dedup’d volume.
The Properties dialog for the disk shows me:
What I actually have on the drive:
- A Windows 7 VHD at 7.4GB, with 3 copies. This would use ~22GB without dedup.
- A Windows 8 Client CTP VHD at 9.2GB
- A Windows 8 Server VHD at 9.0GB
So in total, I would have seen ~40GB of space used without dedup.
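The ~40GB figure is just the sum of the VHD sizes listed above. A quick way to check it in PowerShell is shown below. The `Get-SavingsPercent` helper is a hypothetical function I made up for illustration, not part of the dedup module:

```powershell
# Sizes in GB from the list above: three Windows 7 copies plus one of each other VHD.
$vhdSizesGB = @(7.4, 7.4, 7.4, 9.2, 9.0)

# Logical size: what the files would occupy without dedup.
$logicalGB = [math]::Round(($vhdSizesGB | Measure-Object -Sum).Sum, 1)
"Logical size without dedup: $logicalGB GB"

# Hypothetical helper: percentage saved, given the space actually used on the volume.
function Get-SavingsPercent([double]$LogicalGB, [double]$UsedGB) {
    [math]::Round(100 * ($LogicalGB - $UsedGB) / $LogicalGB, 1)
}
```

The total comes out at roughly 40.4GB. To get a savings percentage, pass that total and the used space reported for the volume to `Get-SavingsPercent`.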
With Windows 8 Deduplication enabled:
Nice job, Windows Server team!
Worth noting: in a dual-boot scenario, what happens when you are booted into another OS and want to access that dedup’d volume?
- Any file that was deduped by the server will not be available. (You will still be able to see the file system reparse points that act as the optimized file stubs for the deduped files.)
- Any file that was not deduped will be available
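If you want to see which files fall into the first bucket, one option is to list everything on the volume that carries the reparse point attribute. This is only a sketch, and it assumes the volume is mounted as F: in whichever OS you are booted into. It will also match other reparse points such as symbolic links, so treat the output as a hint rather than a definitive list of deduped files:

```powershell
# Assumption: the dedup'd volume is mounted as F: in the current OS.
Get-ChildItem -Path "F:\" -Recurse -Force -ErrorAction SilentlyContinue |
    Where-Object {
        # Keep files (not folders) that are flagged as reparse points.
        -not $_.PSIsContainer -and
        ($_.Attributes -band [System.IO.FileAttributes]::ReparsePoint)
    } |
    Select-Object FullName, Length, Attributes
```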