So you’ve deployed a guarded fabric and your VMs are running happily. Having now reached that perfect steady state, let's have a look at the operational and administrative differences relative to a regular fabric. The purpose of this blog isn’t to exhaustively walk you through some mundane day-to-day set of administrative or operational duties; rather, I want to call out:
- The aspects of a guarded fabric that differentiate it from a regular fabric
- The impact of losing any of these guarded-fabric-specific artifacts
- What, if anything, you can do to recover from that loss
What are the ‘new’ things we need to concern ourselves with?
Maintaining a fabric of regular virtual machines on any hypervisor platform pretty much boils down to the same set of administrative and operational tasks and duties: back up the VM definitions, back up their disks, etc. For a guarded fabric, however, there’s a small number of artifacts that are specific to running and maintaining shielded VMs:
- Shielding data (PDK files)
- Shielded template disks
- Volume Signature Catalog files (VSC files)
- Virtual Trusted Platform Modules (vTPMs)
- BitLocker recovery keys
Let’s walk through each one:
1. Shielding data (PDK) files
Shielding data (a PDK file) contains the secrets necessary for tenants (or, if you prefer, virtual machine owners) to securely deploy shielded VMs. PDK files are created by VM owners using the Shielding Data File wizard (which is included with Windows Server 2016 and the Remote Server Administration Tools (RSAT)) and uploaded to the fabric where their shielded VMs will ultimately run. The PDK file is essentially an encrypted bag of secrets that contains, among other things, the following:
- an unattend file used to specialize the VM during provisioning
- an RDP certificate to secure RDP communication with the VM once it's deployed
- a setting indicating whether the PDK is used to create new shielded VMs or convert existing VMs to shielded (see the note below)
- the list of guardians that define which guarded fabrics the shielded VM can run on
- a setting indicating whether the security policy of the new VM is encryption supported (weaker) or shielded (Hyper-V 2016's strongest setting)
- one or more volume ID qualifier rules and their associated volume signature catalog file (more on that in a moment)
The guarded fabric uses PDK files when provisioning a new shielded VM and also when converting an existing (regular) VM to a shielded VM.
Note: As implied, you cannot convert a regular VM to a shielded VM using shielding data that was designated for new VMs only. This is because shielding data designated for new VMs might contain arbitrary secrets put in there by whoever created it. If that same shielding data were later used to convert a VM owned by an attacker to a shielded VM, then the secrets inside the shielding data would have been deposited on the malicious VM's disk unencrypted which probably isn't good. Hence, the setting and enforcement logic to block it.
All of that said then, what happens if you lose the PDK file? Well, assuming you have a copy of all the things kept inside it then losing it merely requires that you re-create the PDK using the Shielding Data File wizard.
2. Shielded template disks
If you already understand the purpose a template disk serves in a fabric of regular VMs, then you're pretty much there with shielded template disks. It’s a regular VHDX file with a Sysprep’d copy of Windows but it's signed at a trusted time by a trustworthy administrator.
To create a shielded template disk, simply create a template disk in the same way you always have and then run it through the Template Disk Signing wizard, another tool in Windows Server 2016 and RSAT. This tool creates a cryptographic signature based on critical parts of the template disk (the OS partition, for example) as it exists at that precise time. The signature is created using a certificate of the administrator's choosing. This signature is then stored on the EFI (the system) partition of the now-shielded template disk.
The certificate used for signing is sensitive and must be considered a secret since possession of it allows an attacker to sign arbitrary template disks that could contain malware. An administrator then extracts the signature from the shielded template disk and saves it in a volume signature catalog file (which, as you already know, is stored in shielding data files).
Later, during shielded VM provisioning, the signature of the shielded template disk is computed once again and compared against the original signature & signing certificate to determine if the shielded template disk has been tampered with. Assuming it hasn't, shielded VM provisioning proceeds as normal.
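To make the "compute again and compare" idea concrete, here is a minimal Python sketch. It is a toy model, not how the provisioning engine is implemented: a plain SHA-256 digest stands in for the certificate-based signature, the catalog is modeled as a dictionary, and the names `make_vsc` and `disk_matches_vsc` are hypothetical. Verification of the signing certificate itself is left out entirely.

```python
import hashlib
import hmac


def disk_digest(disk_bytes: bytes) -> str:
    """Toy stand-in for the signature over the critical parts of a template disk."""
    return hashlib.sha256(disk_bytes).hexdigest()


def make_vsc(name: str, disk_bytes: bytes) -> dict:
    """Record the digest of the disk as it exists at signing time (hypothetical layout)."""
    return {"name": name, "digest": disk_digest(disk_bytes)}


def disk_matches_vsc(disk_bytes: bytes, vsc: dict) -> bool:
    """Recompute the digest at provisioning time and compare it with the recorded one."""
    return hmac.compare_digest(disk_digest(disk_bytes), vsc["digest"])


# Assumed sample data: the bytes stand in for the OS partition contents.
trusted_disk = b"sysprepped OS partition contents"
vsc = make_vsc("example-template", trusted_disk)

untouched_ok = disk_matches_vsc(trusted_disk, vsc)
tampered_ok = disk_matches_vsc(trusted_disk + b" plus malware", vsc)
print(untouched_ok, tampered_ok)
```

The point of the sketch is only that any change to the measured content after signing produces a mismatch, which is what lets provisioning refuse a tampered template.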
What if you lose a shielded template disk? Just recreate it (or vow to never deploy another new shielded VM again which doesn't seem like the right approach to me). Stated another way, there's nothing unique about a shielded template disk except what a trusted administrator might have put on it.
What if you lose the template disk signing certificate itself? If it’s destroyed accidentally, tenants won’t be able to use existing shielding data with any new template disks because they’ll have been signed by a different certificate (you lost the original one, remember).
As already noted, if the signing certificate is stolen, an attacker can sign any template disk and convince the shielded VM provisioning engine that everything's just peachy because it's signed with the blessed certificate--that's really very bad indeed and all existing PDKs should be edited to remove their trust in that now-stolen certificate.
3. Volume signature catalogs (VSC files)
As noted above, shielded template disks have a cryptographic signature stored on them that represents the disk at a trusted time. That signature can be extracted and stored in a VSC file which is, in turn, stored in a shielding data (PDK) file and used during provisioning to ensure that the template disk hasn't been tampered with since being signed.
If you lose a VSC file, you can simply extract it again from the parent shielded template disk.
4. Virtual Trusted Platform Modules (vTPMs)
A vTPM is exactly what its name implies: a virtualized trusted platform module that behaves in the same way as a physical TPM 2.0. The vTPM of a virtual machine is not bound to its Hyper-V host's physical TPM in any way whatsoever--it's entirely synthetic. Shielded VMs encrypt their OS disk and, while a bit of an over-simplification, the keys used to encrypt the OS disk are sealed inside the vTPM. To seal keys inside a TPM (whether it's virtual or otherwise) means that the keys are locked to a particular set of boot + OS measurements and will only be released if the measurements are the same as they were at the time the keys were last sealed there. The term measurements describes certain firmware variables and a set of hashes of the binaries that comprise the boot process and some of the OS itself.
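Sealing is easier to picture with a toy model. The Python sketch below is an illustration under simplifying assumptions, not a TPM implementation: the class `ToyTpm` and its methods are hypothetical, measurements are arbitrary byte strings, and the "seal" is just a stored digest rather than anything hardware-backed.

```python
import hashlib
import hmac
from typing import List, Optional, Tuple


class ToyTpm:
    """Toy model: a secret is released only if the measurements match those at seal time."""

    def __init__(self) -> None:
        self._sealed: Optional[Tuple[bytes, bytes]] = None  # (policy digest, secret)

    @staticmethod
    def _policy_digest(measurements: List[bytes]) -> bytes:
        # Hash each measurement, then hash the sequence, so order and boundaries matter.
        outer = hashlib.sha256()
        for m in measurements:
            outer.update(hashlib.sha256(m).digest())
        return outer.digest()

    def seal(self, secret: bytes, measurements: List[bytes]) -> None:
        self._sealed = (self._policy_digest(measurements), secret)

    def unseal(self, measurements: List[bytes]) -> Optional[bytes]:
        if self._sealed is None:
            return None
        policy, secret = self._sealed
        if hmac.compare_digest(policy, self._policy_digest(measurements)):
            return secret
        return None  # measurements changed: the key stays locked


# Assumed sample measurements standing in for firmware variables and boot binaries.
good_boot = [b"firmware-vars", b"bootmgr-hash", b"os-loader-hash"]
tpm = ToyTpm()
tpm.seal(b"disk-encryption-key", good_boot)

same_boot_key = tpm.unseal(good_boot)
changed_boot_key = tpm.unseal([b"firmware-vars", b"evil-bootmgr", b"os-loader-hash"])
print(same_boot_key is not None, changed_boot_key is None)
```

In this model a changed boot component yields no key at all, which corresponds to the VM dropping into BitLocker recovery rather than booting.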
Just as a virtual machine's configuration and state are stored as files on a disk, so is its vTPM. It's worth noting, though, that the vTPM is encrypted on disk. You can deduce, then, that if a shielded VM's vTPM is either lost or cannot be decrypted, the shielded VM's BitLockered disk also can't be decrypted. Hence it's important to ensure that a shielded VM (or any VM with a vTPM device added to it on a Hyper-V host running Windows Server 2016 or later) is backed up using tools that understand that the VM is more than just a VHDX and a bunch of arbitrary configuration entries in a text file.
What if a shielded VM's configuration, including its vTPM state, is lost but its VHDX is preserved? Adding that VHDX to another VM will cause the VM to boot into BitLocker recovery and you'll need the BitLocker recovery key to complete the boot process.
Note: Guarded fabrics do NOT automate the creation/backup of BitLocker recovery keys--this is the responsibility of the VM owner or the VM owner's IT department.
5. Guardians
Guardian is the term we use to describe the pair of certificates--one encryption, one signing--that protect the symmetric encryption key that is used to encrypt a shielded VM's vTPM (I'd advise that you read that sentence again). Guardians themselves aren't secrets because they only contain public keys (make sure the certificates you use to create the guardian honor this assumption, i.e. the certificate itself doesn't contain the private keys); the private keys of a guardian should be maintained by the Host Guardian Service (HGS). Guardians spend most of their lives indirectly protecting a shielded VM's vTPM. When a new shielded VM is provisioned, the guardians protecting the key that actually encrypts the vTPM are copied from the shielding data file and written to the vTPM's key protector (KP). It's not unreasonable to think of a KP as something akin to an ACL on a file. In summary:
- Each HGS cluster has a default guardian for which it exclusively possesses the private keys
- Each VM owner who creates a PDK file also has an owner guardian--in this one instance, the private keys are maintained outside of HGS by the VM owner
It's logical then to say that PDKs/KPs typically contain at least two guardians: the VM owner’s guardian and one or more guardians that represent the guarded fabrics where the VM is permitted to run--remember, the guardians within the PDK/KP should never contain the private keys.
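The "KP as an ACL" analogy can be sketched in a few lines of Python. This is a conceptual model only and performs no real cryptography: a guardian's public half is simulated as a hash of its private key, the wrapped key is simply held by the object, and every name here (`Guardian`, `KeyProtector`, `unwrap`) is hypothetical rather than part of any Hyper-V or HGS API.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import List, Optional


def public_id(private_key: bytes) -> str:
    """Toy stand-in for a public key: a fingerprint derived from the private key."""
    return hashlib.sha256(private_key).hexdigest()


@dataclass(frozen=True)
class Guardian:
    name: str
    public_id: str  # public half only; a guardian never carries the private key


class KeyProtector:
    """Toy KP: lists the guardians allowed to recover the key that encrypts the vTPM."""

    def __init__(self, vtpm_key: bytes, guardians: List[Guardian]) -> None:
        self._vtpm_key = vtpm_key
        self.guardians = list(guardians)

    def unwrap(self, private_key: bytes) -> Optional[bytes]:
        # Release the key only to the holder of a private key matching a listed guardian.
        presented = public_id(private_key)
        for guardian in self.guardians:
            if hmac.compare_digest(guardian.public_id, presented):
                return self._vtpm_key
        return None


# Assumed sample keys: one held by the VM owner, one held by the fabric's HGS.
owner_private = b"owner-private-key"
hgs_private = b"hgs-private-key"

kp = KeyProtector(
    vtpm_key=b"vtpm-encryption-key",
    guardians=[
        Guardian("owner", public_id(owner_private)),
        Guardian("fabric-hgs", public_id(hgs_private)),
    ],
)

owner_result = kp.unwrap(owner_private)
hgs_result = kp.unwrap(hgs_private)
stranger_result = kp.unwrap(b"some-other-private-key")
print(owner_result is not None, hgs_result is not None, stranger_result is None)
```

The model captures two points from the text: the KP itself holds only public information about each guardian, and whoever loses the matching private key loses the ability to recover the vTPM key through that guardian.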
With all that said then, what happens if you lose a guardian? Well it depends--did you lose the public key, the private key, or both? Or perhaps you lost the PDK in which the guardian lives. If you merely lost the PDK in which the guardian lived, then simply re-create a new PDK file and add your guardian to it. If you lost the default guardian from your Host Guardian Service, simply download the metadata and use it to re-create the guardian. There's a laundry list of ways you could lose a guardian but the reality is this: the only thing that really matters about a guardian is its private key because that is needed to begin the process of decrypting a vTPM--lose that and you're one step closer to losing the whole shielded VM.
6. BitLocker recovery keys
As shielded VMs running Windows use BitLocker to encrypt their OS volume, the BitLocker key is sealed to the vTPM. It is therefore possible in rare cases for the shielded VM to trip BitLocker recovery. Since guarded fabrics do NOT automate the creation or backup of BitLocker recovery keys, it is important to understand that this requirement exists for shielded VMs and must be met through normal Windows operational procedures. If BitLocker recovery is tripped and you do not possess the recovery keys, then the OS volume cannot be decrypted and the VM will no longer boot.
| Artifact | Can it be recreated? | Notes |
|---|---|---|
| Shielding Data File (PDK file) | Yes | Assuming you still have everything that went inside it |
| Shielded Template Disk | Yes | Rebuild the template disk and run the Template Disk Signing wizard against it |
| Volume Signature Catalog (VSC file) | Yes | Re-export it from its parent shielded template disk |
| vTPM | No | Each VM's vTPM is unique and changes frequently over its lifetime. Possible data loss |
| Guardian | Yes | Obtain the encryption and signing certificates used to create the guardian and recreate it |
| BitLocker Recovery Key | No | If a shielded VM has tripped into BitLocker recovery and you do not have the recovery key, all encrypted volumes are lost (and you may want to update your resume) |