Tools and tips used for troubleshooting ESM status problems


This blog post continues to explore how Exchange uses WMI in representing the Status and Monitoring node in Exchange System Manager. In this post, we will cover general troubleshooting of ESM "status" problems.

Using the monitoring node, you can view the list of servers in your organization and their current status to ensure that your servers are operating. You can also verify that the connectors you have established between servers are available to transmit messages.

To verify server and connector status, start Exchange System Manager, expand Tools, expand Monitoring and Status, and then click Status. All connectors and servers in your organization are listed in the details pane. They are identified by administrative group and display a status of Available or Unreachable.

Where do I start if the status shows Unreachable?

Having WMI problems?

Start troubleshooting by testing whether WMI works. A simple test is to open the Computer Management console (compmgmt.msc), expand 'Services and Applications', select 'WMI Control', right-click and select 'Properties'. Some basic computer/OS information obtained from WMI is listed on the General tab. If you cannot see any of that data, you may be having problems with the WMI engine. Ensure that the WMI services are started and inspect your event logs for errors. If this test shows no obvious WMI problems (failure to connect, event 9098, and so on), try the following:

Run wbemtest.exe.

If wbemtest.exe is not on your path, this could be the problem. The system environment's PATH variable should contain %windir%\system32\wbem. If it doesn't, add it and reboot.
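The PATH check above can also be scripted. The following is a minimal sketch in Python, not part of the original tooling; the function name path_has_wbem is hypothetical, and it only inspects the PATH string it is given (when run as a script, that is the PATH of the current process, which may differ from the system-wide value).

```python
import os


def path_has_wbem(path_value, windir):
    """Return True if path_value contains <windir>\\system32\\wbem."""
    root = windir.rstrip("\\/").lower()
    wanted = root + "\\system32\\wbem"
    for entry in path_value.split(";"):
        # Normalize each entry: trim spaces and quotes, expand a literal
        # %windir% or %SystemRoot%, unify slashes, drop a trailing backslash.
        e = entry.strip().strip('"').lower()
        e = e.replace("%windir%", root).replace("%systemroot%", root)
        e = e.replace("/", "\\").rstrip("\\")
        if e == wanted:
            return True
    return False


if __name__ == "__main__":
    # Falls back to C:\Windows if the windir variable is not set (assumption).
    windir = os.environ.get("windir", r"C:\Windows")
    print(path_has_wbem(os.environ.get("PATH", ""), windir))
```

The comparison is case-insensitive because Windows paths are, so C:\WINDOWS\System32\Wbem and %windir%\system32\wbem are treated as the same entry.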

In wbemtest, hit Connect, type 'root/cimv2' in the top text field and hit 'Login'. Then hit 'Enum Classes...', leave the text box empty, but select 'Recursive' and hit OK. You should see a list of 600+ objects (count displayed in top left).
If you get an error here, your WMI service is not functioning.

In wbemtest, hit Connect, type 'root/cimv2/applications/exchange' in the top text field and hit 'Login'. Then hit 'Enum Classes...', leave the text box empty, but select 'Recursive' and hit OK. You should see a list of about 57 objects (count displayed in top left).

Double-click the ExchangeClusterResource line, hit the "Instances" button, and look for an error dialog or an error message above the list box. Repeat for all classes that start with "Exchange". It is normal for some of the classes to report no instances.

Turn up WMI logging in the Computer Management console (compmgmt.msc), then restart the winmgmt service. To turn on verbose logging, right-click 'My Computer', select 'Manage', expand 'Services and Applications', select 'WMI Control', right-click and select 'Properties', click the 'Logging' tab, select 'Verbose', and add three zeroes to the 'Maximum size' field, making it 65536000. Wbemcore.log will have a lot of interesting information. For more information, refer to:

310315 Troubleshooting monitoring and status in Exchange and in Small Business Server

http://support.microsoft.com/default.aspx?scid=kb;EN-US;310315

288590 Error "0x8004100e" and event ID 9097 occur when you run the System Attendant in Exchange Server 2003

http://support.microsoft.com/default.aspx?scid=kb;EN-US;288590

Save the following script as wmitest.vbs and run it on each server. This will tell us what each server has in its own routing table.

' Connect to the Exchange WMI namespace on the local server.
Set WMI = GetObject("WinMgmts:root/cimv2/applications/exchange")
Set objs = WMI.InstancesOf("ExchangeServerState")

For Each obj In objs
    ' Build one line per server: name, state, and any flags.
    text = obj.Name & " = " & obj.ServerStateString
    If obj.Unreachable Then
        text = text & " (Unreachable)"
    End If
    If obj.ServerMaintenance Then
        text = text & " (Maintenance)"
    End If
    ' Echo the assembled line, including the flags.
    WScript.Echo text
Next

When running this script on server X, if the server line says "Unreachable", it means that the routing engine of server X cannot connect to its routing master. Whenever a server is unreachable, its state is irrelevant (it may be very stale).

You can run the script from a command prompt also:

cscript wmitest.vbs >%computername%.txt
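Once you have collected one output file per server, it can help to summarize them. The sketch below is an illustration in Python, not part of the original tooling. It assumes each status line has the form "name = state", optionally followed by "(Unreachable)" and "(Maintenance)" markers, and it skips anything else (blank lines, the cscript banner). The function name parse_status_lines and the sample server names and state values are hypothetical.

```python
import re

# Assumed line shape: "<server> = <state>" plus optional
# " (Unreachable)" / " (Maintenance)" markers at the end.
LINE_RE = re.compile(
    r"^(?P<name>.+?) = (?P<state>.*?)"
    r"(?P<flags>(?: \((?:Unreachable|Maintenance)\))*)$"
)


def parse_status_lines(text):
    """Return one dict per recognized status line in the script output."""
    results = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # blank line, cscript banner, or anything unexpected
        flags = m.group("flags")
        results.append({
            "name": m.group("name"),
            "state": m.group("state"),
            "unreachable": "(Unreachable)" in flags,
            "maintenance": "(Maintenance)" in flags,
        })
    return results


if __name__ == "__main__":
    # Hypothetical sample; real output comes from the redirected .txt files.
    sample = "SERVER1 = OK\nSERVER2 = Unknown (Unreachable)\n"
    for row in parse_status_lines(sample):
        if row["unreachable"]:
            print(row["name"], "is reported as unreachable")
```

Comparing the parsed results from different servers shows quickly which servers disagree about who is unreachable.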

Winroute

Winroute is a very important tool for inspecting routing information. Run Winroute from the problem server.

There is some tracing (not much) present that can be gathered using regtrace. For additional information about how to use the Regtrace utility, click the following article number to view the article in the Microsoft Knowledge Base:

238614 How to set up Regtrace for Exchange 2000

http://support.microsoft.com/default.aspx?scid=kb;EN-US;238614

The information the System Attendant gathers is displayed in the userdata field associated with a server object in Winroute. Here is what you need to know about that field when troubleshooting such issues.

The userdata field is 8 bytes long and looks like: 0701000000000000

The first byte (07) is ignored, byte 2 indicates whether maintenance mode is on or off, and bytes 3 - 8 represent queues, disk, memory, CPU, services, and cluster, in that order.

A value of 81 in the second byte of the User Data field means the server is in maintenance mode.

[Screenshot: how this looks in Exchange System Manager]

A blank User Data field means the System Attendant has not populated the data.

[Screenshot: how this looks in Exchange System Manager]

The following is an example where everything is good: the server is not in maintenance mode and no resources are being monitored.

[Screenshot: how this looks in Exchange System Manager]
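The User Data layout described above can be unpacked with a few lines of Python. This is a minimal sketch under stated assumptions, not an official decoder: the function name decode_user_data is hypothetical, only the value 81 in byte 2 is interpreted (as maintenance mode, per the text), and bytes 3 - 8 are returned as raw numbers because their individual values are not documented here.

```python
# Order of bytes 3 - 8 as described in the text.
FIELDS = ("queues", "disk", "memory", "cpu", "services", "cluster")


def decode_user_data(hex_string):
    """Split an 8-byte User Data hex string into named fields."""
    hex_string = hex_string.strip()
    if not hex_string:
        return None  # blank: the System Attendant has not populated the data
    data = bytes.fromhex(hex_string)
    if len(data) != 8:
        raise ValueError("expected 8 bytes (16 hex digits), got %d" % len(data))
    decoded = {
        # Byte 1 (data[0]) is ignored, as the text states.
        "maintenance_byte": data[1],
        "in_maintenance": data[1] == 0x81,  # only 81 is described in the text
    }
    for name, value in zip(FIELDS, data[2:]):
        decoded[name] = value  # raw value; meaning not specified here
    return decoded


if __name__ == "__main__":
    print(decode_user_data("0701000000000000"))
```

Pasting the userdata value from Winroute into decode_user_data makes it easier to see at a glance which byte position is non-zero.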

For more information, refer to:

832281 Link state issues and routing issues in Exchange 2000 Server and in Exchange Server 2003

http://support.microsoft.com/default.aspx?scid=kb;EN-US;832281

Miscellaneous Troubleshooting tips:

  1. Verify that RPC (launch Event Viewer and connect to another server), WMI, and DCOM permissions are working.
  2. Run EXBPA in health check mode and permissions mode.

Check to see if the computer account for the Exchange server has an explicit Deny permission for the Receive As permission assigned.

For more information on permissions, refer to:

899393 Event ID 1025 is logged two times in the Application log with the "EcGenerateNRN: Error: 0x80070005" and "EcGenerateReadReport: Error: 0x80070005" error codes on a computer that is running Exchange Server 2003 or Exchange 2000 Server

http://support.microsoft.com/default.aspx?scid=kb;EN-US;899393

  3. On a particular Exchange server, in ESM under Tools | Monitoring and Status, right-click "Status", connect to another Exchange server, and check the status of the computer object. Does this work?
  4. Right-click the Exchange server, select Properties | Security tab | Advanced. Make sure "Allow inheritable permissions from parent to propagate to this object" is checked (it should be).
  5. Open gpedit.msc, go to Computer Configuration | Windows Settings | Security Settings | Local Policies | User Rights Assignment | Access this computer from the network, and check that all the required users and groups (Authenticated Users and Everyone) are listed.
  6. Run telnet <Remote_server_name> 691 to make sure port 691 is open to the remote Exchange server.
  7. Make sure that no firewalls are interfering with communication.
  8. Gather debug trace data from the routing engine (aka regtrace). Here are the steps for that:
  • On the problem server, stop the routing engine service.
  • From Start/Run, type regtrace and leave the window open temporarily.
  • Go to the following key in regedit: HKLM\Software\Microsoft\MosTrace\CurrentVersion\DebugAsyncTrace
  • Add a reg_multi_sz value called Modules
  • Enter the following values in it on separate lines: RESVC REAPI Routing
  • Switch back to the regtrace application, select all checkboxes on the Traces tab.
  • Select the File option on the Output tab. C:\trace.atf is usually the default and is fine. Specify 50MB for file size.
  • The threading tab shouldn't have anything selected.
  • Click Apply to begin the tracing. Don't click OK; otherwise the window will close.
  • Start the routing engine service. Let it run for about 5 minutes.
  • Switch back to the regtrace application. Set it to "No Tracing". Remove all checkmarks on the Traces tab.
  • Delete the Modules value from the registry and send us the trace.atf file for further analysis.

9. Gather DSACLS from a good server and bad server and windiff them.

On startup, Server X connects to the master on port 691 and requests to attach. The master then authenticates Server X by checking whether the computer account of Server X has the "Send As" right on the master's server object. The Exchange Domain Servers group has the "Send As" right on the master, and any member of this group benefits from that. If this group is present on Server Y's ACL with the proper entry and the other servers are working okay, that further supports that the group itself is okay. For Server Y to deny Server X the ability to connect to the master, Server Y must be seeing some problem with the ACL that is enough to prevent Server X from connecting. In that case, add Server X explicitly to Server Y's ACL with Full Control.

In the DSACLS output for Server X, you may notice a number of duplicate ACLs. This usually happens if someone does the following:

a) Clears the "Inherit Permissions from Parent" checkbox and selects the Copy option

b) Selects "Inherit Permissions from Parent" again, which creates a duplicate set of ACLs in addition to the ones created by choosing Copy above.

I have seen instances where duplicate entries have caused issues.

10. In ESM, navigate to Servers, open the properties of Server X, and select the Security tab. By default, Exchange servers are members of "Exchange Domain Servers", which has the correct privileges. However, manual changes, or the fact that the Exchange server in question is a DC, may cause this privilege to be denied through membership in other groups. Check whether the account serverX$ has the "Send As" privilege. One can run exchutil /v /all to verify this.
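For the DSACLS comparison in step 9, duplicate ACL entries can be hard to spot by eye. The sketch below is an illustration in Python, not part of the original tooling. It assumes that each entry in the saved DSACLS output starts at the left margin and that indented lines continue the previous entry; the function names and the sample principals are hypothetical. Section headings that repeat in the output would also be reported, so treat the result as a starting point rather than a verdict.

```python
from collections import Counter


def split_entries(dsacls_text):
    """Group lines into entries: a non-indented line starts a new entry."""
    entries, current = [], None
    for raw in dsacls_text.splitlines():
        if not raw.strip():
            continue
        # Collapse runs of whitespace so column alignment does not matter.
        normalized = " ".join(raw.split())
        if raw[0].isspace() and current is not None:
            current.append(normalized)  # continuation of the previous entry
        else:
            current = [normalized]
            entries.append(current)
    return [" | ".join(entry) for entry in entries]


def duplicate_entries(dsacls_text):
    """Return {entry: count} for every entry that appears more than once."""
    counts = Counter(split_entries(dsacls_text))
    return {entry: n for entry, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical sample; real input is the saved DSACLS output for a server.
    sample = (
        "Allow EXAMPLE\\Exchange Domain Servers   SPECIAL ACCESS\n"
        "                                         READ PROPERTY\n"
        "Allow EXAMPLE\\Exchange Domain Servers   SPECIAL ACCESS\n"
        "                                         READ PROPERTY\n"
    )
    for entry, n in duplicate_entries(sample).items():
        print(n, "x", entry)
```

Running this against the output from the good server and the bad server gives two short lists that are easier to compare than the full windiff view.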

More information:

257265 General troubleshooting for transport issues in Exchange 2000 Server and in Exchange Server 2003

http://support.microsoft.com/default.aspx?scid=kb;EN-US;257265

- Nagesh Mahadev
