Azure template to deploy a forest with two domains, Part 2 -- understanding the template structure

This is the second post in a three-part series.

In the first post I showed an Azure (ARM) template to deploy a forest with two domains and all required objects such as VNETs, storage accounts, etc.

My goal for this part is to explain how I arrived at the end result, to help you on your way if you have similar challenges, and to help you avoid the problems I encountered. Before I continue, I must acknowledge my starting point: the excellent domain deployment template by Simon Davies. Thanks Simon, I would not have gotten this off the ground without your efforts.

At a high level, this is what needs to happen, in this particular order:

  1. Create a storage account.
  2. Create availability sets for the DCs for each domain.
  3. Create the VNET and subnet.
  4. Create the Network Security Group.
  5. Create VM1, and promote the forest.
  6. Create VM3, and create the child domain.
  7. Create VM2 and VM4, and promote them as additional DCs in their domains.

If you know AD a bit, you can see the reasons behind this order. The one that tripped me up was (6), creating the child. Initially, I created the root with two DCs. However, it turned out that sometimes the DNS on VM2 was not quite ready, so when creating the child domain in VM3 this would fail. Sometimes. Most of the time it would be fine. I had to analyze the dcpromo logs to find out what was failing, and why. The learning point here was: create the base configuration first, then scale out.
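In an ARM template, this kind of ordering is expressed with dependsOn. The fragment below is a minimal sketch of steps 5 to 7 only, not the actual template from the first post. It assumes that each step is a nested deployment using the files in the nestedtemplates folder, that a hypothetical assetLocation parameter holds the base URL of the repository, and that the deployment names are free to choose. The parameters that the nested templates would need are left out.

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "assetLocation": {
      "type": "string",
      "metadata": {
        "description": "Hypothetical parameter: base URL of the repository that hosts the nested templates."
      }
    }
  },
  "resources": [
    {
      "comments": "Step 5: create VM1 and promote the forest root. Deployment name is illustrative.",
      "type": "Microsoft.Resources/deployments",
      "apiVersion": "2015-01-01",
      "name": "createForest",
      "properties": {
        "mode": "Incremental",
        "templateLink": {
          "uri": "[concat(parameters('assetLocation'), '/nestedtemplates/createForest.json')]",
          "contentVersion": "1.0.0.0"
        }
      }
    },
    {
      "comments": "Step 6: the child domain waits for the forest root, which at this point has only one DC.",
      "type": "Microsoft.Resources/deployments",
      "apiVersion": "2015-01-01",
      "name": "createChildDomain",
      "dependsOn": [
        "Microsoft.Resources/deployments/createForest"
      ],
      "properties": {
        "mode": "Incremental",
        "templateLink": {
          "uri": "[concat(parameters('assetLocation'), '/nestedtemplates/createChildDomain.json')]",
          "contentVersion": "1.0.0.0"
        }
      }
    },
    {
      "comments": "Step 7 (assumed wiring): scale out with an extra DC only after the base configuration is complete.",
      "type": "Microsoft.Resources/deployments",
      "apiVersion": "2015-01-01",
      "name": "configureRootDC2",
      "dependsOn": [
        "Microsoft.Resources/deployments/createChildDomain"
      ],
      "properties": {
        "mode": "Incremental",
        "templateLink": {
          "uri": "[concat(parameters('assetLocation'), '/nestedtemplates/configureADNextDC-yes.json')]",
          "contentVersion": "1.0.0.0"
        }
      }
    }
  ]
}
```

Making the additional DC depend on the child domain, rather than only on the forest, is what "create the base configuration first, then scale out" looks like in template form.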

Besides managing the dependencies carefully so that things happen in the correct order, we also need to make changes inside the VMs: change DNS references, create the forest, add a domain, install admin tools, etc. There are two ways to go about it: custom scripts, or PowerShell Desired State Configuration (DSC). DSC itself comes in two variations as well: DSC native to Azure, and DSC as a VM extension. Because most of the DSC components that I need already exist, I preferred DSC, and in particular the version using VM extensions. The challenge with DSC and scripting is that you run code inside the VM that can behave unpredictably, depending on timing, OS details, and so on. It turned out to be hard to make it reliable. What I learned: when using ARM templates, avoid DSC if you can... but in this case there was obviously no choice.
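To give an idea of what DSC as a VM extension looks like in a template, here is a minimal sketch. It is not copied from my template: the parameters vmName, assetLocation and domainName, the extension name, and the configuration function name CreateADRootDC1 are all assumptions made for illustration, and a real forest promotion would also need credentials passed through protected settings.

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "vmName": {
      "type": "string",
      "metadata": { "description": "Hypothetical: name of an existing VM to configure." }
    },
    "assetLocation": {
      "type": "string",
      "metadata": { "description": "Hypothetical: base URL of the repository that hosts the DSC zip files." }
    },
    "domainName": {
      "type": "string",
      "metadata": { "description": "Hypothetical: DNS name of the forest root domain." }
    }
  },
  "resources": [
    {
      "comments": "Sketch of the DSC VM extension. The function name after the backslash is an assumption.",
      "type": "Microsoft.Compute/virtualMachines/extensions",
      "apiVersion": "2015-06-15",
      "name": "[concat(parameters('vmName'), '/CreateADForest')]",
      "location": "[resourceGroup().location]",
      "properties": {
        "publisher": "Microsoft.Powershell",
        "type": "DSC",
        "typeHandlerVersion": "2.19",
        "autoUpgradeMinorVersion": true,
        "settings": {
          "ModulesUrl": "[concat(parameters('assetLocation'), '/DSC/CreateADRootDC1.ps1.zip')]",
          "ConfigurationFunction": "CreateADRootDC1.ps1\\CreateADRootDC1",
          "Properties": {
            "DomainName": "[parameters('domainName')]"
          }
        }
      }
    }
  ]
}
```

The extension downloads the zip file from ModulesUrl and runs the named configuration inside the VM, which is why the zip files need to be reachable from a public location.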

When developing complex templates like these, the matter of tools is very important. You really want to simplify your life as much as possible. For instance, I quickly realized that I needed to call sub-templates from the main template. This is very hard to do from Visual Studio, so my choice was to develop with GitHub as the main repository. If you are serious about developing ARM templates you need to know GitHub. It hosts your files and documentation, allows you to work with other people, and, importantly, supports mature version control including branches and releases, and the ability to revert to previous points in time in case you make a mistake.

But the GitHub editor for templates (using JSON) is not great. Visual Studio 2015 is much better at it. The free community edition of Visual Studio will do. So I wanted to use Visual Studio together with GitHub. There are various options for integration, but what it boils down to is that you must sync your development machine (laptop) with GitHub. After trying multiple options I ended up with GitKraken, an excellent local GUI/sync engine. You have multiple choices here; I'm describing what works for me, but your most convenient setup may well be different.

Let's have a look at the templates and DSC structure.

  • DSC\
    • ConfigureADNextDC.ps1.zip
    • CreateADChildDomainDC1.ps1.zip
    • CreateADRootDC1.ps1.zip
  • nestedtemplates\
    • configureADNextDC-yes.json
    • createChildDomain.json
    • createForest.json
    • configureADNextDC-no.json
    • CreateAndPrepnewVM-no.json
    • CreateAndPrepnewVM-yes.json
    • vnet.json
    • vnet-with-dns-server.json
  • azuredeploy.json
  • README.md
  • do-nothing.json
  • azuredeploy.parameters.json

Yes, that is a lot, and no, I'm not going to discuss all of it. Feel free to dissect as much as you like though. Keeping to the main parts, there is a top-level folder with the main template called azuredeploy.json. Stick to this filename. The rest of the world will assume that a deployment uses it as the starting point.
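The companion file azuredeploy.parameters.json supplies values for the parameters of the main template. As a rough sketch of its shape, with parameter names and values that are purely hypothetical and not taken from my template:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "domainName": {
      "value": "contoso.com"
    },
    "childDomainName": {
      "value": "child"
    }
  }
}
```

Each entry must match a parameter declared in azuredeploy.json; anything not listed here falls back to the default value in the template, if it has one.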

There is a subfolder DSC with three zip files. All of them are DSC containers, containing the DSC template and resources. The template contains references for deploying the zip files, which is one of the reasons that a public repository like GitHub is so convenient. One decision I made early on was to stick with the standard DSC resources, and to not modify them even when this would be convenient. Any modification would make it impossible to update the resources with a new, public version. I am not going to discuss how DSC works, but will just mention the DSC resources I used:

  • xNetworking, to control the network configuration, in particular DNS references.
  • xDisk, to format and mount disks.
  • cDisk, a community extension to xDisk with the ability to wait until a disk comes online.
  • xActiveDirectory, to manage forests, domains and Domain Controllers.

One important DSC resource that I did not use is xDnsServer. After extensive testing I had to give it up. It was just not reliable directly after DC promotion. I wanted to use it to configure DNS forwarders, but had to resort to a custom DSC script resource. This one too took some time to make reliable. Look at the code to see what I mean: catching exceptions, retry loops, validations, etc.

Another problem I encountered was that not all resources worked equally well with various operating systems. Windows Server 2008 R2 was a nonstarter because almost nothing worked. Windows Server 2012 (R1) needed an update to the PowerShell execution policy before DSC would work. My baseline was Windows Server 2012 R2, meaning that all new features in my template would work first here. Windows Server 2016 mostly worked directly, but due to different timing it exposed several dependency problems. Lessons learned: when using DSC, use but do not modify public DSC resources, test them well, and test each operating system that you intend to support with your template.

The file list above shows the optional readme.md. The "md" part means Markdown. If you have never looked into this: it is a markup language like HTML, but much simpler. There are no tags as such, just simple hints telling the Markdown processor that you want something in bold, or that something is a header, etc. It is meant to be readable in raw form, unlike HTML. GitHub is natively aware of the Markdown format and uses it to document projects. The readme.md file is used as the intro page for each project at GitHub.

But still, editing Markdown is not so easy if you don't do it every day. So after some searching I ended up with two resources: a nice Markdown cheat sheet at GitHub, and a Markdown Editor (plugin) for Visual Studio. This plugin shows a real-time preview of what the rendered Markdown looks like.

Let me finish up with a couple of web resources that I used a lot.

This is a great link for best practices, especially after you have some experience and spent hours debugging issues that turned out to be trivial:

Using linked templates and sub-templates, essential when you are looking to re-use templates. In my template you will see this used in a couple of places, for example to create a generic VM. It also explains how to implement yes/no choices in a template.
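One common way to implement such a yes/no choice, sketched below, is to put the answer in the file name of the linked template. This is an illustration under assumptions rather than an excerpt from my template: the parameter names createSecondDC and assetLocation and the deployment name are hypothetical, and the "-no" variant is assumed to be a template that deploys nothing.

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "createSecondDC": {
      "type": "string",
      "allowedValues": [ "yes", "no" ],
      "defaultValue": "yes",
      "metadata": { "description": "Hypothetical switch: deploy an additional DC or not." }
    },
    "assetLocation": {
      "type": "string",
      "metadata": { "description": "Hypothetical: base URL of the repository that hosts the nested templates." }
    }
  },
  "resources": [
    {
      "comments": "The parameter value selects configureADNextDC-yes.json or configureADNextDC-no.json.",
      "type": "Microsoft.Resources/deployments",
      "apiVersion": "2015-01-01",
      "name": "configureNextDC",
      "properties": {
        "mode": "Incremental",
        "templateLink": {
          "uri": "[concat(parameters('assetLocation'), '/nestedtemplates/configureADNextDC-', parameters('createSecondDC'), '.json')]",
          "contentVersion": "1.0.0.0"
        }
      }
    }
  ]
}
```

The main template stays the same in both cases; only the URI of the linked template changes, which explains the paired "-yes" and "-no" files in the nestedtemplates folder.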

There are two DSC models for Azure. The first one is native, and uses an Azure-based repository. However, that requires authentication and is not what I had in mind. The second one is less fancy and uses a VM extension in the Azure agent to do its job. It was exactly what I needed.