Troubleshooting MDT 2012 Monitoring

I mentioned a while back that I wanted to do a blog post talking about how to troubleshoot the new MDT 2012 monitoring feature for Lite Touch deployments, but first I had to actually describe it.  If you haven’t reviewed that post, you might want to check it out first at https://blogs.technet.com/b/mniehaus/archive/2012/03/09/mdt-2012-new-feature-monitoring.aspx.

So now let’s talk about troubleshooting.  First, let’s look at the server side.  You have to enable monitoring on a computer that has MDT 2012 installed.  When you use Deployment Workbench on that computer and check the box to enable monitoring, Workbench will first check to see if the specified monitoring host name is local:

image

It doesn’t really matter if you specify an IP address, a short host name, or a fully-qualified host name, as long as the clients can resolve whatever you specify.  If you specify a name that Workbench doesn’t think is local (because Workbench itself can’t resolve the name back to an IP address assigned to the current machine), it won’t try to install the monitoring component; instead, it will try to contact that server to see if monitoring is running on that computer.  If it is, great; if it isn’t, you’ll see an error message:

image

If you look closely at the error “tip” at the end of the “Monitoring host” line, you’ll see a message like “Unable to connect to the specified server and port”.  If you think you specified the local computer name and got that error, then Workbench couldn’t figure out that it was the local computer name (something that’s harder to do than you might think).  If you are specifying a different server and see this error, then it’s having problems communicating with that other server.

Tip #1:   Make sure the name you specify in Workbench can be resolved to the IP address of the current machine.

What does the checkbox do?

You’ve checked the checkbox and can’t see that anything happened.  So what was actually done?  Two things:

  1. A new “Microsoft Deployment Toolkit Monitor Service” service was installed on the computer and started.

  2. An additional entry was added to the [Default] section of CustomSettings.ini telling the clients how to contact the server, with an entry such as:

    EventService=https://mdt-server.mdt.local:9800

Tip #2:   Make sure the “Microsoft Deployment Toolkit Monitor Service” is installed and running.  If it’s not installed and it should be, you can uncheck the box, click apply, then check the box again and click apply to reinstall it.  If it’s installed but not running, try to start it.

Tip #3:   Make sure the entry was added to CustomSettings.ini by looking at the Rules tab.  Because of a peculiarity with the way Workbench works, if you make any changes to the Rules tab after you’ve clicked the “Enable monitoring” checkbox but before you’ve clicked OK, it’s possible that the changes made on the Rules tab overlay the EventService entry in CustomSettings.ini, but it’s easy enough to put it back manually.

What if the service doesn’t start?

The service has two dependencies:

  1. .NET 3.5 SP1 needs to be installed.  That shouldn’t be an issue, because you can’t install MDT 2012 without .NET 3.5 SP1.
  2. The ports you specified need to be available for use.  (Generally that’s not an issue either, as 9800 and 9801 aren’t commonly-used TCP ports.  But it is possible to have another application use them.  Fortunately, MDT will happily use other ports.)

So there’s no dependency on IIS or SQL Server.  The service uses .NET to host a web server as part of the service process, and it uses a SQL Compact database (basically a set of DLLs, which ship with MDT, that run in the service process) to store the monitoring information.  It’s designed to be easy to install and run.

Tip #4:   If you try to start the service and it won’t start, that most likely means the ports you chose were already in use.  (If you want to know what’s using the ports, use a tool like TCPView, available from https://technet.microsoft.com/en-us/sysinternals/bb897437.)  Pick different ports.

While it’s always possible that there could be some other reason the service fails, I haven’t seen any other causes.  But if you know the ports are not in use and the service still won’t start, capture a trace using DebugView (https://technet.microsoft.com/en-us/sysinternals/bb896647) to see if it provides any further clues.  If not, contact Microsoft Support for assistance.

Verifying the Monitoring Service

The monitoring service listens on the two ports that you specified.  The first of these ports (9800) is used by computers being deployed to send progress events.  The second (9801) is used by Workbench itself to query information about deployments being monitored.  To make sure these ports are accessible, we can manually connect to each one using Internet Explorer.

To verify the “event port” from the monitor server itself, you can use a URL such as:

https://localhost:9800/MDTMonitorEvent/

If that works, you should see a response like:

image

That’s a proper response in this case – the web service doesn’t expect to be called in this way (an HTTP GET request instead of an HTTP POST request), so it’s telling you the proper way to call the service.

To verify the “data port” from the monitor server itself, you can use a URL such as:

https://localhost:9801/MDTMonitorData/

image

This response (which is an ODATA feed in case you are curious) confirms that the data feed is working as expected.

But those are the easy queries – they are using “localhost”, which is almost never subject to firewall restrictions.  Next, you need to try these queries remotely, using the appropriate “remote” URLs:

https://mdt-server:9800/MDTMonitorEvent/

https://mdt-server:9801/MDTMonitorData/

If those work, great.  If they don’t, then you need to make sure that whatever firewall is running on the monitoring server allows the ports you specified (e.g. 9800 and 9801) to be accessed from remote hosts.

Tip #5:   Make sure you can access the monitor service ports both locally and remotely.  Adjust the firewall rules as necessary.

Note that there are other networking “challenges” that can get in the way, e.g. IPSec domain isolation.  In this configuration, computers that aren’t domain-joined, e.g. running from Windows PE, can’t talk to domain-joined computers because they aren’t using encrypted IPsec communication.  This type of configuration will never work – you would need to set up the monitoring service on a “boundary server” that has been configured to allow non-IPsec traffic on the configured ports.  So don’t assume that if a “remote” URL works from a domain-joined machine, it will also work from a workgroup machine (or Windows PE) – know how your network is configured.

From the Client Side

When the EventService task sequence variable is set (via the processing of CustomSettings.ini), each MDT script executed in the task sequence will send an event to the monitor service on the “event port” URL.  When this succeeds, you’ll see a message like this:

image

If a script is unable to send an event, you’ll see something different:

image

That’s a clear sign that something isn’t right.  Make sure the service is running, that the firewall ports are open, etc. – the same challenges we already reviewed.

Tip #6:   Check the client logs to make sure the clients are able to talk to the monitoring service.

Another way you might notice an issue:  If the monitor service isn’t running, the clients will still try to connect to it, eventually timing out.  This timeout process will cause a delay at the end of each step in the task sequence, so if you are watching the task sequence progress dialog, you’ll see steps that you never noticed before (because they usually run so fast) now taking a long time.

From Workbench

When you try to look at the monitoring data from Workbench, it calls a PowerShell cmdlet (Get-MDTMonitorData), then that PowerShell cmdlet makes the “data port” query to retrieve the details from the monitoring service.  If the service is working as expected, you can see the list of monitored machines in Workbench.  If the service isn’t working, you’ll see something like this instead:

image

Good advice, make sure the service is running Smile

Finally

Still having issues?  Post them as comments here, or send me an e-mail at mniehaus@microsoft.com and we’ll try to figure out what’s going on.