Overview of Host Guardian Service (HGS) Diagnostics

[This post is authored by Jim Hughes, Software Engineer for the Windows Server Team]

The Host Guardian Service (HGS) is a principal component in enabling Hyper-V to host Shielded VMs in Windows Server 2016. Shielded VMs are your typical Hyper-V virtual machines, but protected from tampering and inspection by platform administrators and malicious actors.

The initial deployment of HGS is a complex task that encompasses the management of multiple roles and features (Active Directory, DNS, Failover Clustering, IIS, and Hyper-V) in addition to infrastructure management tools (Group Policy and System Center). That was a lot for me to remember to write down in this post—putting all of these pieces together in a production deployment is even more difficult. The problem only compounds when something goes wrong and your HGS deployment stops functioning—where does one start with an environment so complex?

To solve this problem, we designed a set of PowerShell cmdlets for diagnosing HGS and its supporting infrastructure. These cmdlets let you spend less time guessing and checking, reducing the time it takes to deliver shielded VM’s to your customers. If things go wrong later on, you can minimize the impact by quickly triaging various configuration points, checking for frequent missteps we’ve noted during the past four technical previews.

What’s in the Box

HGS Diagnostics are available in Windows Server 2016 Technical Preview in both the Host Guardian Service role and the Host Guardian Hyper-V Support feature. This means that all diagnostic tools are available on both your guarded hosts and HGS cluster. To learn more about deploying HGS, read the deployment guide.

HGS Diagnostics 101

Diagnostics are accessed using the Get-HgsTrace cmdlet. This can be executed remotely using PowerShell remoting or locally from a PowerShell prompt. To audit the local machine, run Get-HgsTrace with the -RunDiagnostics switch (without the -RunDiagnostics switch, trace data is collected from the host but not analyzed; this is useful for those who are willing to get their hands dirty to manually diagnose a tricky issue).

HGS-Diagnostics01

A report is generated that details any issues identified on the local system. To see everything that was tested and not just noteworthy results, provide the -Detailed switch. Each failure message specifies what went wrong and how to remediate the issue. In this case it looks like I forgot to restart after installing a new code integrity policy.

If the test detects no issues but a problem is still occurring, you can immediately narrow the scope of your investigation to items not verified by the diagnostics.

HGS Diagnostics 202

We’ve just scratched the surface of what this tool can do. You can even diagnose multiple hosts at once with the New-HgsTraceTarget cmdlet—diagnostics can use the increased knowledge of your deployment to find issues that could not be identified by looking at each host in isolation. To learn more, read the documentation available on TechNet.

Disclaimer: This is still pre-release software and as we continue to iterate, there may be changes to the syntax and functionality of the diagnostic cmdlets.

Happy triaging!