Agents on Windows 2012 R2 Domain Controllers can stop responding or heart-beating


 

This is an issue I have been tracking for some time. When you deploy SCOM 2012 agents on Windows Server 2012 R2 domain controllers, the agents can stop responding and/or stop sending heartbeats. The agent services will still be running. You will see events in the OpsMgr event log stop processing, and you might see heartbeat failures as well. I have personally seen this on both of my Windows Server 2012 R2 domain controllers, which also run DNS and DHCP and have the AD, DNS, and DHCP management packs imported.

This is caused by an issue in the Server OS (Windows Server 2012 R2), which is outlined at http://support.microsoft.com/kb/2923126

A hotfix that addresses the issue is included in the February 2014 update rollup:  http://support.microsoft.com/kb/2919394

I recommend you consider deploying this update if you are deploying DCs on Windows Server 2012 R2. This issue could potentially affect any server running the Windows Server 2012 R2 operating system; I have only experienced it on DCs thus far.
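If you want to confirm whether the rollup is already present before deploying it, one approach is to save a text listing of installed updates (for example, the output of PowerShell's Get-HotFix) and scan it for the KB number. The following is only a minimal sketch and not part of the original guidance: the function names and the sample listing are hypothetical, and a later rollup that supersedes KB2919394 would not be detected by this simple check.

```python
import re

# KB number of the February 2014 update rollup referenced in this post.
REQUIRED_KB = "KB2919394"


def installed_kbs(hotfix_listing):
    """Return the set of KB identifiers found in a text listing of updates."""
    matches = re.findall(r"\bKB\d+\b", hotfix_listing, flags=re.IGNORECASE)
    return {m.upper() for m in matches}


def has_rollup(hotfix_listing, required=REQUIRED_KB):
    """True if the required KB appears in the listing (case-insensitive)."""
    return required.upper() in installed_kbs(hotfix_listing)


# Hypothetical sample listing, made up purely for illustration.
SAMPLE_LISTING = """
Source   Description  HotFixID
DC01     Update       KB2883200
DC01     Update       KB2919394
"""

if __name__ == "__main__":
    if has_rollup(SAMPLE_LISTING):
        print(REQUIRED_KB + " is listed as installed")
    else:
        print(REQUIRED_KB + " was not found in the listing")
```

Working from saved text rather than querying the server directly keeps the check easy to run against listings collected from several DCs.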

If you use Windows Update, this hotfix is listed as “Optional”.


Comments (28)

  1. Kevin Holman says:

    @Rodrigo –

    I will never comment on when a UR might or might not ship. I will only speculate…. since these are developed quarterly at best, and SCOM 2012R2 UR1 shipped at the very end of Jan, I’d not expect a following UR until the very end of April at the absolute soonest…. again, that is only speculation. Even if I knew when a UR might ship, I would not comment until they were public.

  2. Kevin Holman says:

    @Tao Yang –

    Yep, me too. I have just been waiting on the hotfix to announce it. 🙁

  3. Kevin Holman says:

    James – that’s interesting. I have not seen that at all. Are you monitoring for memory utilization on WMI? What kind of error alerts are you seeing?

  4. Kevin Holman says:

    @Klaus –

    You should not see that issue immediately after a reboot. The events might have been a flood of older ones catching up. There is an issue with WMI leaking on Domain Controllers with DNS on WS2012R2 (and possibly previous versions). I am planning a blog article on that.

  5. Daniel Ovadia says:

    Thank you Kevin! It works

  6. Kevin Holman says:

    There is a WMI leak that remains, after this hotfix. This hotfix repaired the issue where the agent would immediately stop processing. The WMI issue is with the DNS PowerShell provider, but I don’t have an ETA on a fix just yet.

  7. Daniel_Australia says:

    I’m having this issue – I downloaded the KBs that I think should fix the issue – Windows8.1-KB2919355-x64.msu and Windows8.1-KB2955164-x64.msu, but when I try to apply them to my Server 2012 R2 DCs, I’m told they don’t apply. Have I downloaded the wrong versions perhaps? If anyone can be more specific about the hotfix or update files required, that would be very much appreciated!

  8. Cheers Kevin, this had been giving me no end of grief!!

  9. Anonymous says:

    This is updated as of 3-3-2014 In general – you should evaluate all hotfixes available, and only apply

  10. Good to know Kevin, thank you… I always appreciate your posts. Do you use twitter? Would like to follow if you do. @reidartwitt http://johansenreidar.blogspot.no

  11. Tao Yang says:

    Thanks Kevin. I’ve been having this issue in my lab ever since I’ve upgraded everything to 2012 R2!

  12. Thanks Kevin. This was bugging me for a long time.

  13. Anonymous says:

    Active Directory is the cornerstone of many networking environments. Active Directory Domain Controllers

  14. rodrigo says:

    Thanks for the articles, Kevin. Can you tell us whether the next update rollup, UR2 for SCOM 2012 R2, will ship in April?

  15. Arjan Vroege says:

    Many thanks for this post. Have seen this in my lab.. It was a very annoying problem.

  16. Klaus Runggaldier says:

    After installing the update rollup my 2012 R2 DC started reporting to SCOM 2012 R2 again, but opened a lot of alerts in regards to WMI (corrupted repository, though it is perfectly fine), DNS Server (service not started though it is) etc. I think the SCOM agent has some problems connecting to WMI? Anyone else experience this?

  17. Klaus Runggaldier says:

    @Kevin Thanks. A reboot fixed the problem, just like you said.

  18. Robert R says:

    @Kevin, any additional info about the 2012R2 wmi leak issue?

  19. Dave M2 says:

    Testing this at the moment, as I have been having this problem for ages with agents going offline after an hour or so on 2012 R2 DCs.

    Also, I am getting WMI leak issues, after which WMI stops responding as well, along with issues collecting all DC performance counters.

  20. Jon Sykes says:

    Hey Dave, we are seeing the same issue! If you look at the events collected on the DCs, you will see that the system is out of memory, or that the paging file is too small for the WMI query to complete, or that no more threads can be created in the system.

  21. tom says:

    What event logs are you looking at showing the system is out of memory or paging file is too small? I don’t believe this is the same issue I am seeing but all 3 of my 2012 R2 DCs keep having the Health Service stop despite having this hotfix installed.

  22. Rickard says:

    The issue for the DNS monitoring is now fixed in the Windows Server 2012 R2 update rollup: May 2014.
    http://support.microsoft.com/kb/2955164

  23. Anonymous says:

    There was an issue when you monitored DNS server roles on Windows Server 2012 R2 servers. The DNS PowerShell

  24. Still getting WMI errors on 2012 R2 domain controllers after KB2955164 is installed.

  25. Keithk2 says:

    I am having the same issue, but with an agent on Server 2008 (non-DC). I recycle the MMA service and the heartbeat begins again, the log starts to process for a few seconds, and the heartbeat alert closes, but in a few minutes the log stops and another heartbeat alert is generated. I have 2000 other agents without this issue. I have tried to reinstall the agent from scratch and removed the agent completely from the DB, but the same problem occurs. Any suggestion on how I can troubleshoot this or what I might be able to do? Any known issues like this on a 2008 non-DC server that I am not aware of?

  26. Keithk2 says:

    OK, I resolved my issue. It looks like a CA Wiley agent was installed and may have been competing with the SCOM agent. I uninstalled the CA Wiley agent, and the SCOM agent started to communicate with the management server without issue for the first time in a while.

  27. Daniel says:

    Thanks Kevin, worked for me also like a charm 🙂

  28. Per J says:

    Hi, I’m seeing the same issue on some 2012 DCs. Does anyone know if there’s a fix available for this version?
