Host Guardian Service - AD-based vs. TPM-based attestation

[This post is authored by Dean Wells, Principal Program Manager for the Windows Server Security Product Team]

Overview

The Host Guardian Service (HGS) is a new role in Windows Server 2016 that provides health attestation and key protection/release services for Hyper-V hosts running Shielded VMs. This blog describes the differences between HGS’ two mutually-exclusive attestation modes. For more information about the HGS role and how it’s configured, see the blog post here.

What is HGS attestation?

Generally speaking, attestation is a process in which the health of a given computer is measured in some way—typically by an external, trusted authority. In the case of Shielded VMs, HGS serves as the external, trusted authority and is used to measure specific health characteristics of Hyper-V hosts in order to determine if they’re authorized (authorized because they’re healthy) to run Shielded VMs. The process used to determine whether a Hyper-V host is healthy or not and the specifics of what we measure are dictated by HGS’ attestation mode.

NOTE: HGS’ attestation mode is configured during installation (by using the Initialize-HgsServer cmdlet) but can also be changed after the fact using the Set-HgsServer cmdlet.

HGS supports two mutually-exclusive attestation modes:

  1. AD-based attestation (sometimes written as Windows Server Active Directory based attestation)
  2. TPM-based attestation (Trusted Platform Module)

Let’s go through the requirements and basic setup process for each of the two modes and wrap things up with the assurance (security promise) differences between them.

1. AD-based attestation

With AD-based attestation, HGS measures only the group membership of the Hyper-V host that is attesting against it. Note that we’re measuring the computer account’s membership, not a user’s. Let’s talk a bit more about how that looks and how it’s set up. In most scenarios, AD-based attestation will use two forests/domains: one forest to which the Hyper-V hosts are joined (the fabricAD) and a second forest that is automatically created when HGS is first installed (HGS-AD).

This is probably a good time to point out that HGS can also use existing AD forests, i.e. not create its own forest during install. If this sounds like an appealing deployment option, ensure you’ve carefully considered that the HGS forest contains all of the servers running the HGS service and, therefore, it also contains the keys that can be used to compromise a shielded VM. In short, it contains the keys that can unravel a guarded fabric. For this reason, HGS typically resides in its own AD forest where the role is co-located with the domain controllers themselves. There are no particular technical requirements in order for an existing forest to be compatible with HGS’ needs but there are operational requirements and security-related best practices. Suitable forests are likely purpose-built and serve one sensitive function, e.g. the forest used by Microsoft’s Privileged Identity Management solution. These sorts of forests usually exhibit the following characteristics: they have very few admins, they are not general-purpose in nature and the frequency of logons is low. General purpose forests such as CORP AD forests are not suitable for use by HGS. Since HGS needs to be isolated from fabric administrators, fabric AD forests fall very much into the unsuitable bucket.

OK, back to the AD-based attestation setup steps: next, a one-way cross-forest trust needs to be created from HGS-AD to fabricAD (i.e. HGS-AD trusts fabricAD). Next, an Active Directory global group (TrustedHostGroup) is created in the fabricAD domain and the computer accounts of the Hyper-V hosts that need to run Shielded VMs are added as members. Finally, the security identifier (SID) of the fabricAD\TrustedHostGroup is added to HGS by a trusted admin using the Add-HgsAttestationHostGroup cmdlet. At this stage, you’ve now enlightened HGS as to what a healthy host looks like when using AD-based attestation.

Active Directory trust diagram showing HGS trusting the fabric AD

When a Hyper-V host attests with HGS, the host’s identity and group membership are sent to HGS’ attestation service in the form of a Kerberos service ticket (hence the need for the trust). The service ticket contains the computer account SID of the Hyper-V host as well as the SIDs of any groups that the computer account is a member of (which includes the TrustedHostGroup). HGS compares this list of SIDs against its own list of trusted attestation host groups and if a match is found, the host is issued a certificate of health entitling it to request keys from HGS’ key protection service.
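The group-membership check described above boils down to a set intersection. Here is a minimal Python sketch of that decision only; the function name and the SIDs are hypothetical, and the real service extracts the SIDs from a validated Kerberos service ticket rather than from a plain list.

```python
def attest_ad_host(ticket_sids, trusted_group_sids):
    """Return True if any SID carried in the host's ticket is a trusted host group."""
    return not set(ticket_sids).isdisjoint(trusted_group_sids)


# Hypothetical SIDs, for illustration only.
TRUSTED_HOST_GROUPS = {"S-1-5-21-1111111111-2222222222-3333333333-1105"}

host_ticket_sids = [
    "S-1-5-21-1111111111-2222222222-3333333333-2101",  # the host's computer account
    "S-1-5-21-1111111111-2222222222-3333333333-515",   # Domain Computers
    "S-1-5-21-1111111111-2222222222-3333333333-1105",  # fabricAD\TrustedHostGroup
]

# Member of the trusted group: a certificate of health would be issued.
print(attest_ad_host(host_ticket_sids, TRUSTED_HOST_GROUPS))      # True
# Same host without the group SID: attestation fails.
print(attest_ad_host(host_ticket_sids[:2], TRUSTED_HOST_GROUPS))  # False
```

Notice that nothing about the host’s hardware or software state enters the decision, which is why the assurances of this mode rest entirely on who controls group membership in the fabricAD.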

2. TPM-based attestation

With TPM-based attestation, HGS enforces far more rigorous attestation requirements than those used in AD-based attestation. As before, let’s walk through how we set this up and call out some of the setup differences as we go along.

When using TPM-based attestation, one of the more obvious differences is that no Active Directory trust relationship is required, or even recommended. Instead, the identity of a Hyper-V host is expressed to HGS by using a unique key found within each Hyper-V host’s TPM (the TPM must be a version 2 TPM)—the key is referred to as an EKpub or public endorsement key. You can extract the key from the Hyper-V host using the Get-PlatformIdentifier cmdlet. You must then add that key to the HGS service using the Add-HgsAttestationTpmHost cmdlet—HGS will now at least permit this host to attempt attestation but we’re not done yet.

Next, we need to teach HGS how to measure whether a Hyper-V host is healthy or not—there are two distinct parts to the measurement process:

  • a baseline policy
  • a code-integrity (CI) policy

The baseline policy contains measurements that describe the binaries loaded by the Operating System during the boot process. These measurements are extended into specific platform configuration registers (or PCRs) in the host’s TPM.

The CI policy contains a whitelist of binaries (drivers, tools, etc.) that are allowed to run on the Hyper-V host.
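To make the idea of a whitelist concrete, here is a deliberately simplified Python sketch that allows a binary only if its SHA-256 hash is on an approved list. Everything in it (the function names, the sample byte strings) is hypothetical; real CI policies are enforced by the operating system on the host itself and can also express rules based on signing publisher rather than individual file hashes.

```python
import hashlib


def file_hash(content: bytes) -> str:
    """Hex SHA-256 digest of a binary's contents."""
    return hashlib.sha256(content).hexdigest()


def is_allowed(content: bytes, allowlist: set) -> bool:
    """A binary may run only if its hash appears in the policy's list."""
    return file_hash(content) in allowlist


# Hypothetical binaries captured from the trusted, healthy reference host.
approved_binaries = [b"storage driver v1", b"backup agent v3"]
ci_allowlist = {file_hash(b) for b in approved_binaries}

print(is_allowed(b"storage driver v1", ci_allowlist))   # True
print(is_allowed(b"unsigned debug tool", ci_allowlist)) # False
```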

To create the baseline and CI policies, a trusted admin nominates an existing host (or builds a new Hyper-V host) that represents their definition of health. The baseline and CI policies are then extracted from this trusted and healthy host. Here’s the basic process and cmdlets needed (the exact syntax can be found in the Shielded VM deployment guide or via each cmdlet’s online help):

  1. Identify an existing healthy host or build a new one
    • if all hosts run on identical hardware and are installed and configured with identical software, then only a single baseline and CI policy is needed
    • if, however, some hosts are from a different hardware vendor or are a significantly different model from the same vendor, then generating a second baseline and CI policy is almost certainly required
  2. To extract the baseline policy from an appropriate Hyper-V host:
    • use the Get-HgsAttestationBaselinePolicy cmdlet
  3. To add the baseline policy to HGS:
    • use the Add-HgsAttestationTpmPolicy cmdlet
  4. To extract the code integrity policy from an appropriate Hyper-V host:
    • use the New-CIPolicy cmdlet to generate the policy (this takes a while, e.g. 30+ mins)
    • then use the ConvertFrom-CIPolicy cmdlet to convert it to the format that HGS needs
  5. To add the code integrity policy to HGS:
    • use the Add-HgsAttestationCiPolicy cmdlet

When a Hyper-V host attests with HGS, the host first sends its EKpub in order to prove that it is authorized to participate in the guarded fabric—HGS compares the EKpub to its known list and if it finds a match, the attestation process continues.

Next, the Hyper-V host must send over its baseline measurements contained in something called the tcglog. The tcglog (or Trusted Computing Group log) can be thought of as a list of individually-measured binaries and the order in which they loaded. It is used to ensure that unauthorized software (such as a rootkit) is not loaded prior to the OS. During attestation, HGS compares the tcglog to its database of known-healthy baselines and if it finds a match, the attestation process continues.

Next, HGS uses the series of measurements contained in the tcglog to compute what the values in the host’s TPM should be. HGS then reaches back into the host’s TPM over a secure channel and verifies that the PCR-values match what it just computed. If they do, the attestation process continues.
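The recomputation step works because a PCR can only be extended, never set directly: each new value is a hash of the previous value concatenated with the new measurement. The Python sketch below replays a log that way and compares the result with what a TPM would report. It is an illustration under stated assumptions rather than HGS’ actual code: it assumes a SHA-256 PCR bank whose registers start at all zeros, and the measure helper, the PCR indices and the sample binaries are hypothetical.

```python
import hashlib

PCR_SIZE = 32  # bytes in a SHA-256 PCR


def extend(pcr: bytes, digest: bytes) -> bytes:
    """TPM extend: new PCR value = SHA-256(old PCR value || measurement digest)."""
    return hashlib.sha256(pcr + digest).digest()


def replay_log(events):
    """Recompute PCR values from (pcr_index, digest) events in load order."""
    pcrs = {}
    for index, digest in events:
        pcrs[index] = extend(pcrs.get(index, bytes(PCR_SIZE)), digest)
    return pcrs


def measure(binary: bytes) -> bytes:
    """Hypothetical stand-in for measuring a binary before it is loaded."""
    return hashlib.sha256(binary).digest()


# The log a host submits, claiming a clean boot.
clean_log = [
    (0, measure(b"firmware")),
    (4, measure(b"boot manager")),
    (4, measure(b"os loader")),
]
# What a compromised host's TPM actually recorded: a rootkit loaded mid-boot.
actual_boot = clean_log[:2] + [(4, measure(b"rootkit"))] + clean_log[2:]

honest_tpm = replay_log(clean_log)
compromised_tpm = replay_log(actual_boot)

print(replay_log(clean_log) == honest_tpm)       # True: log and TPM agree
print(replay_log(clean_log) == compromised_tpm)  # False: the log omits the rootkit
```

Because the extend operation chains every measurement into the next value, a host cannot submit a clean-looking log while its TPM holds the values produced by a different boot sequence.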

Finally, the host sends over a hash of its CI policy, which HGS then compares to its database of known-good CI policies. If it finds a match, the attestation process is affirmatively completed and a certificate of health is sent back to the Hyper-V host, which entitles it to request keys from HGS’ key protection service.

At this point, HGS now has the following:

  • a key that identifies who the host is (EKpub) indicating it’s potentially trustworthy
    • (where ‘potentially’ will be determined by the baseline and CI policy measurements)
  • a cryptographically verified list of the binaries that the host loaded
  • the host’s CI policy
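Putting the four checks together, the TPM-based flow can be sketched as a short sequence of gates, each of which must pass before the next is attempted. This Python model is a simplification under several assumptions: it tracks a single PCR, treats a baseline as an exact ordered tuple of measurement digests, and uses hypothetical names throughout. It is not how HGS is implemented, only a way to see the order of the decisions.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class AttestationRequest:
    ekpub: bytes           # identifies the host's TPM
    tcglog: tuple          # ordered measurement digests from boot
    quoted_pcr: bytes      # PCR value reported by the host's TPM
    ci_policy_hash: bytes  # hash of the host's CI policy


def replay(digests) -> bytes:
    """Recompute a single PCR from a log, starting from all zeros."""
    pcr = bytes(32)
    for digest in digests:
        pcr = hashlib.sha256(pcr + digest).digest()
    return pcr


def attest_tpm_host(req, known_ekpubs, known_baselines, known_ci_hashes) -> str:
    if req.ekpub not in known_ekpubs:
        return "denied: unknown EKpub"
    if req.tcglog not in known_baselines:
        return "denied: unknown baseline"
    if replay(req.tcglog) != req.quoted_pcr:
        return "denied: PCR mismatch"
    if req.ci_policy_hash not in known_ci_hashes:
        return "denied: unknown CI policy"
    return "healthy"  # a certificate of health would be issued here


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


# Hypothetical values registered with HGS by a trusted admin.
baseline = (h(b"firmware"), h(b"boot manager"), h(b"os loader"))
known_ekpubs = {b"ekpub-host-01"}
known_baselines = {baseline}
known_ci_hashes = {h(b"ci-policy-v1")}

good = AttestationRequest(b"ekpub-host-01", baseline, replay(baseline), h(b"ci-policy-v1"))
rogue = AttestationRequest(b"ekpub-unknown", baseline, replay(baseline), h(b"ci-policy-v1"))

print(attest_tpm_host(good, known_ekpubs, known_baselines, known_ci_hashes))   # healthy
print(attest_tpm_host(rogue, known_ekpubs, known_baselines, known_ci_hashes))  # denied: unknown EKpub
```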

Contrasting the respective requirements and vulnerabilities of the two attestation modes

We now understand the requirements, configuration differences and the specific measurements that are taken in each of the two modes in order to determine whether a Hyper-V host is healthy or not. The comparison below outlines some of the resulting differences relative to each attestation mode:

1. Hyper-V host hardware and software requirements?
  AD-based attestation:
    • Windows Server 2016 Datacenter Edition
    • No specific hardware requirements beyond what Hyper-V itself needs (SLAT, etc.)
  TPM-based attestation:
    • Windows Server 2016 Datacenter Edition
    • Hyper-V host hardware must provide:
      • UEFI 2.3.1 rev. C or later
      • Secure Boot / Measured Boot
      • TPM v2

2. Host Guardian Service (HGS) hardware and software requirements
  Both modes (no differences):
    • Windows Server 2016 Server Core and up
    • The hardware need only be able to run Windows Server 2016 Server Core and up

3. What do we measure & attest to in order to permit Shielded VMs to be powered on or to be live migrated to a new host?
  AD-based attestation:
    • The Hyper-V host must be a member of a designated/trusted AD group whose SID (security identifier) has been configured on HGS
  TPM-based attestation:
    • The Hyper-V host computer’s boot process, including that it’s using secure, measured boot
    • The host’s Operating System and drivers
    • The host’s code-integrity policy
    • Various other aspects, such as whether a debugger is attached to the host (NOT permitted)

4. What protections does the Shielded VM receive?
  Both modes (no differences):
    • A version 2 compatible TPM
    • UEFI firmware with secure, measured boot support
    • Encrypted disks with secure, TPM-backed key-release
    • Encrypted Live Migration traffic

5. Which guest Operating Systems can be shielded?
  Both modes (no differences):
    • Windows 8 and later
    • Windows Server 2012 and later

6. Supports both ‘Shielded’ mode and ‘Encryption supported’ mode?
  AD-based attestation: Yes
  TPM-based attestation: Yes

7. Provide some examples of how a guarded host or a shielded VM might be attacked
  AD-based attestation:
    • The AD admin is bribed or blackmailed and adds a compromised Hyper-V host to the trusted group in AD
    • The Hyper-V admin installs malware on a Hyper-V host
    • The HGS admin is bribed or blackmailed and weakens the attestation requirements
    • An attacker compromises the identity of a legitimate HGS admin
    • Hyper-V host firmware or platform attacks that enable the attacker to obtain keying material
  TPM-based attestation:
    • The HGS admin is bribed or blackmailed and weakens the attestation requirements
    • An attacker compromises the identity of a legitimate HGS admin and weakens the attestation requirements
    • An attacker abuses administrative privileges and manages to obtain guardian (private) keys or transport keys for specific shielded VMs
    • Hyper-V host firmware or platform exploits that enable the attacker to obtain keying material