Get Visibility into your Exchange Environment with OMS

By Priscilla 'Nini' Ikhena, Program Manager – Microsoft OMS Log Analytics team

Are you interested in easily troubleshooting issues in your Exchange environment? A little while ago, we made near-real-time performance data collection available in OMS, so you can now track metrics, which are important for managing Exchange, alongside the logs in your environment.

I recently set up a small Exchange environment in Azure and have been tracking different metrics in OMS using search queries based on the Exchange Management Pack monitors, and now you can do the same! Before you begin searching, be sure to add the necessary event logs and enable the performance counters you intend to collect data for.

Some of the queries I’ve been tracking:

Average percentage of time that the processor is executing application or operating system processes (should be less than 75% on average)

Counter Name: Process\% Processor Time
Query: Type=Perf ObjectName="Process" CounterName="% Processor Time" | measure avg(Average) by Computer | where AggregatedValue > 75
Note: If the aggregated average value goes above 75, the query returns one or more results, which raises the tile's count on my dashboard above 0 and highlights the tile!

Recovery Action Failed

Log Name: Microsoft-Exchange-ManagedAvailability/RecoveryActionLogs
Query: Type=Event Source=Microsoft-Exchange-ManagedAvailability EventLog="Microsoft-Exchange-ManagedAvailability/RecoveryActionLogs"

MS Exchange Frontend Transport Service has not been running for a period of time

Log Name: Application
Query: Type=Event EventLog=Application Source="MSExchangeFrontEndTransport"

The connection between the Client Access server and the Mailbox server failed

Log Name: Application
Query: Type=Event EventLog=Application Source=ActiveSync EventID=1022 

The Availability service could not successfully send a proxy Web request to another instance of the Exchange Availability service that is running in a different Active Directory site or forest.

Log Name: Application
Query: Type=Event EventLog=Application EventID=4002 Source="MSExchange Autodiscover"

A setting in the Web.config file was not valid and has been reset to the default value

Log Name: Application
Query: Type=Event EventLog=Application EventID=1033 Source=ActiveSync 

Percentage of the free usable space on my disk drive

Counter Name: LogicalDisk\% Free Space
Query: Type=Perf ObjectName=LogicalDisk CounterName="% Free Space"

Low Disk Space – Free Disk Space is less than 10%

Counter Name: LogicalDisk\% Free Space
Query: Type=Perf ObjectName="LogicalDisk" CounterName="% Free Space" | measure avg(Average) by Computer | where AggregatedValue < 10

Average number of bytes transferred to or from the disk during write or read operations

Counter Name: LogicalDisk\Avg. Disk Bytes/Transfer
Query: Type=Perf ObjectName=LogicalDisk CounterName="Avg. Disk Bytes/Transfer" 

Amount of Virtual memory in use

Counter Name: Memory\% Committed Bytes In Use
Query: Type=Perf ObjectName=Memory CounterName="% Committed Bytes In Use" | measure avg(Average) by Computer

Amount of physical memory available for running processes

Counter Name: Memory\Available MBytes
Query: Type=Perf ObjectName=Memory CounterName="Available MBytes"

Percentage of elapsed time processor spends in User Mode

Counter Name: Processor\% User Time
Query: Type=Perf ObjectName=Processor CounterName="% User Time" 

Rate at which bytes are sent and received over each adapter

Counter Name: Network Interface\Bytes Total/sec
Query: Type=Perf ObjectName="Network Interface" CounterName="Bytes Total/sec"

LDAP Search Time is beyond the warning threshold

Counter Name: MSExchange ADAccess Domain Controllers\LDAP Search Time
Query: Type=Perf ObjectName="MSExchange ADAccess Domain Controllers" CounterName="LDAP Search Time" | measure avg(Average) by Computer | where AggregatedValue > 50

LDAP Read Time is beyond the warning threshold

Counter Name: MSExchange ADAccess Domain Controllers\LDAP Read Time
Query: Type=Perf ObjectName="MSExchange ADAccess Domain Controllers" CounterName="LDAP Read Time" | measure avg(Average) by Computer | where AggregatedValue > 50

Exchange ActiveSync could not access a mailbox on a Mailbox server because the Mailbox server is offline.

Log Name: Application
Query: Type=Event EventLog=Application EventID=1023 Source=ActiveSync

Length of the output packet queue, in packets

Counter Name: Network Interface\Output Queue Length
Query: Type=Perf ObjectName="Network Interface" CounterName="Output Queue Length" 

Memory leak occurs

Counter Name: Process\Private Bytes
Query: Type=Perf ObjectName=Process CounterName="Private Bytes" 

Client RPC Average Latencies are very high

Counter Name: MSExchange RpcClientAccess\RPC Averaged Latency
Query: Type=Perf ObjectName="MSExchange RpcClientAccess" CounterName="RPC Averaged Latency" | measure avg(Average) by Computer | where AggregatedValue > 250

Getting data on Message Tracking Report

Counter Name: MSExchange Message Tracking\Get-MessageTrackingReport Task Executed
Query: Type=Perf ObjectName="MSExchange Message Tracking" CounterName="Get-MessageTrackingReport Task Executed"
Counter Name: MSExchange Message Tracking\Get-MessageTrackingReport Task Executed/Sec
Query: Type=Perf ObjectName="MSExchange Message Tracking" CounterName="Get-MessageTrackingReport Task Executed/Sec" 

The Exchange Transport service is rejecting message submissions due to memory consumption higher than the configured threshold

Log Name: Application
Query: Type=Event EventLog=Application EventID=15007 Source=MSExchangeTransport

Outlook Web Access was unable to read or update some of its configuration settings

Log Name: Application
Query: Type=Event EventLog=Application EventID=64 Source="MSExchange OWA"

Exchange Direct Push has detected that the configuration value for the minimum heartbeat interval is set to a value that is too low

Log Name: Application
Query: Type=Event EventLog=Application EventID=1011 Source=ActiveSync

Unable to add an email address because it is invalid

Log Name: Application
Query: Type=Event EventLog=Application EventID=1 Source=InternetProxy

The database engine lost one page of corrupted data

Log Name: Application
Query: Type=Event EventLog=Application EventID=500 Source=ESE

MSExchangeMailSubmission: there is no available Hub Transport server in the local site

Log Name: Application
Query: Type=Event EventLog=Application EventID=1008 Source=MSExchangeMailSubmission

Outlook Web Access is not available for one of the mailboxes in a mailbox database

Log Name: Application
Query: Type=Event EventLog=Application EventID=57 Source="MSExchange OWA"
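
The threshold-style performance queries in this list all share one shape: a Type=Perf filter on ObjectName and CounterName, then an average per computer compared against a limit. As a rough sketch, a small PowerShell helper can compose that string so the quoting stays consistent. The function name New-PerfThresholdQuery and its parameters are made up for illustration and are not part of OMS.

```powershell
# Hypothetical helper (not part of OMS): compose a threshold query for one performance counter.
function New-PerfThresholdQuery {
    param(
        [Parameter(Mandatory)][string]$ObjectName,
        [Parameter(Mandatory)][string]$CounterName,
        [Parameter(Mandatory)][ValidateSet('>', '<')][string]$Operator,
        [Parameter(Mandatory)][double]$Threshold
    )

    # Quote both names so values containing spaces or '%' stay a single term.
    $filter = 'Type=Perf ObjectName="{0}" CounterName="{1}"' -f $ObjectName, $CounterName

    # Average per computer, then keep only the computers that cross the threshold.
    return "$filter | measure avg(Average) by Computer | where AggregatedValue $Operator $Threshold"
}

# Example: rebuild the low disk space query from this post.
New-PerfThresholdQuery -ObjectName 'LogicalDisk' -CounterName '% Free Space' -Operator '<' -Threshold 10
```

The resulting string can be pasted into the search box or stored as the Query value of a saved search.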

I then saved these searches and added them to my dashboard.

Note: I’ve set thresholds on these tiles so that a tile is highlighted whenever it has an unusual number of logs. For example, if the ‘Exchange ActiveSync could not access a mailbox on a Mailbox server’ tile reads greater than 0, this happened at least once in the Exchange environment, which may lead to messages not being delivered.

You can do this by selecting “Customize” at the bottom of the Dashboard page, selecting the tile you’re interested in, and then selecting “Edit”.

 

Additionally, once you have added the right events and counters, a quick and easy way to add these search queries to your Saved Searches is to copy and paste the PowerShell script below into Windows PowerShell ISE. Be sure to download the ArmClient command-line tool before running this code. More information is available in “ArmClient – A command line tool for Azure API”. Here is the script:

armclient login

$api = "2015-03-20" 

#getSubscription

$allSubscriptions = armclient get /subscriptions?api-version=$api | out-string | ConvertFrom-Json

$uiPrompt = "Select a subscription.`n"

$count = 1

foreach ($subscription in $allSubscriptions.value) {

    $uiPrompt += "$count. " + $subscription.displayName + "(" + $subscription.subscriptionId + ")`n"

    $count++

}

$answer = (Read-Host -Prompt $uiPrompt) - 1   # the menu is 1-based; array indexes are 0-based

$subscription = $allSubscriptions.value[$answer].subscriptionId

#Write-Host $subscription

#getWorkspace

$allWorkspaces = armclient get /subscriptions/$subscription/providers/Microsoft.OperationalInsights/workspaces?api-version=$api | out-string | ConvertFrom-Json

$uiPrompt = "Select a workspace.`n"

$count = 1

 

foreach ($workspace in $allWorkspaces.value) {

    $uiPrompt += "$count. " + $workspace.name + "(" + $workspace.id + ")`n"

    $count++

}

$answer = (Read-Host -Prompt $uiPrompt) - 1   # the menu is 1-based; array indexes are 0-based

$workspace = $allWorkspaces.value[$answer].name 

 

if ($allWorkspaces.value[$answer].id -notcontains $resourcegroup)

{

    $WSId=$allWorkspaces.value[$answer].id

    # Keep everything after "resourcegroups/" (14 characters plus the slash = 15)
    $tempvar=$WSId.Substring($WSId.IndexOf("resourcegroups")+15,$WSId.Length - $WSId.IndexOf("resourcegroups") - 15)

    $resourcegroup=$tempvar.Substring(0,$tempvar.IndexOf("/"))

    Write-Host "New resource group determined: $resourcegroup"

}

 

#list of search queries

$searchList = @(

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'% Processor is greater than 75%', 'Query':'Type=Perf ObjectName=Process CounterName= \`"% Processor Time\`" | measure avg(Average) by Computer | where AggregatedValue > 75 '  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Recovery Action Failed', 'Query':'Type=Event Source=Microsoft-Exchange-ManagedAvailability EventLog= \`"Microsoft-Exchange-ManagedAvailability/RecoveryActionLogs\`" '}",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'MS Exchange Frontend Transport Service has not been running for a period of time', 'Query':'Type=Event EventLog=Application Source= \`"MSExchangeFrontEndTransport\`" '  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'The connection between the Client Access Server and the Mailbox server failed', 'Query':'Type=Event EventLog=Application Source=ActiveSync EventID=1022'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Availability service could not successfully send a proxy Web request', 'Query':'Type=Event EventLog=Application EventID=4002 Source= \`"MSExchange Autodiscover\`" '}",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'A setting in the Web.config file was not valid and has been reset to the default value', 'Query':'Type=Event EventLog=Application EventID=1033 Source=ActiveSync'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Percentage of the free usable space on my disk drive', 'Query':'Type=Perf ObjectName=\`"LogicalDisk\`" CounterName= \`"% Free Space\`" '  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Free Disk Space is less than 10%', 'Query':'Type=Perf ObjectName=\`"LogicalDisk\`" CounterName= \`"% Free Space\`" |measure avg(Average) by Computer | where AggregatedValue < 10'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Average number of bytes transferred to or from the disk during write or read operations', 'Query':'Type=Perf ObjectName=\`"LogicalDisk\`" CounterName= \`"Avg. Disk Bytes/Transfer\`"'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Amount of Virtual memory in use', 'Query':'Type=Perf ObjectName=\`"Memory\`" CounterName= \`"% Committed Bytes In Use\`" | measure avg(Average) by Computer'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Amount of physical memory available for running processes', 'Query':'Type=Perf ObjectName=\`"Memory\`" CounterName= \`"Available MBytes\`"'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Percentage of elapsed time processor spends in User Mode', 'Query':'Type=Perf ObjectName=\`"Processor\`" CounterName= \`"% User Time\`"'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Rate at which bytes are sent and received over each adapter', 'Query':'Type=Perf ObjectName=\`"Network Interface\`" CounterName= \`"Bytes Total/sec\`"'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'LDAP Read Time is beyond the warning threshold', 'Query':'Type=Perf ObjectName=\`"MSExchange ADAccess Domain Controllers\`" CounterName= \`"LDAP Read Time\`" | measure avg(Average) by Computer | where AggregatedValue > 50'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'LDAP Search Time is beyond the warning threshold', 'Query':'Type=Perf ObjectName=\`"MSExchange ADAccess Domain Controllers\`" CounterName= \`"LDAP Search Time\`" | measure avg(Average) by Computer | where AggregatedValue > 50'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Exchange ActiveSync could not access a mailbox on a Mailbox server because the Mailbox server is offline', 'Query':'Type=Event EventLog=Application EventID=1023 Source=ActiveSync'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Length of output packet queue in packet', 'Query':'Type=Perf ObjectName=\`"Network Interface\`" CounterName= \`"Output Queue Length\`"'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Memory leak occurs', 'Query':'Type=Perf ObjectName=\`"Process\`" CounterName= \`"Private Bytes\`"'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Client RPC Average Latencies are very high', 'Query':'Type=Perf ObjectName=\`"MSExchange RpcClientAccess\`" CounterName= \`"RPC Averaged Latency\`" |measure avg(Average) by Computer | where AggregatedValue > 250'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Getting data on Message Tracking Report', 'Query':'Type=Perf ObjectName=\`"MSExchange Message Tracking\`" CounterName= \`"Get-MessageTrackingReport Task Executed/Sec\`"'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'High Message Queuing to Hub Transport servers', 'Query':'Type=Perf ObjectName=\`"MSExchangeIS\`" CounterName= \`"Messages Queued for Submission\`" |measure avg(Average) by Computer | where AggregatedValue > 20'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Exchange transport service is rejecting message submissions', 'Query':'Type=Event EventLog=Application EventID=15007 Source=MSExchangeTransport '  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Outlook Web Access was unable to read or update config settings', 'Query':'Type=Event EventLog=Application EventID=64 Source= \`"MSExchange OWA\`"'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Exchange Direct Push has detected that the config value for the min heartbeat interval is too low', 'Query':'Type=Event EventLog=Application EventID=1011 Source=ActiveSync'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Unable to add an email address because it is invalid', 'Query':'Type=Event EventLog=Application EventID=1 Source=InternetProxy'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'The database engine lost one page of corrupted data', 'Query':'Type=Event EventLog=Application EventID=500 Source=ESE'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'There is no available Hub transport server in the local site', 'Query':'Type=Event EventLog=Application EventID=1008 Source=MSExchangeMailSubmission'  }",

    "{'etag': 'W/`"datetime\'2015-09-22T23%3A35%3A35.3182423Z\'`"', 'properties': { 'Category': 'MS Exchange ', 'DisplayName':'Outlook Web Access is not available for one of the mailboxes in a mailbox database', 'Query':'Type=Event EventLog=Application EventID=57 Source= \`"MSExchange OWA\`"'  }"

    )

  $myId = 0

    foreach ($query in $searchList) {

        $url = "/subscriptions/$subscription/resourceGroups/$resourcegroup/providers/Microsoft.OperationalInsights/workspaces/$workspace/savedsearches/exchange$myId" + "?api-version=$api"

        armclient put $url $query

        $myId++

    }
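
The payloads in the script above are hand-escaped strings, which makes quoting slips easy. As a rough sketch of an alternative, the two helpers below read the resource group out of a workspace ID with a regular expression and build the request body from a hashtable with ConvertTo-Json. Both function names are hypothetical and are not part of ArmClient or OMS. The etag is left optional because whether the service requires it is not confirmed here.

```powershell
# Hypothetical helper: pull the resource group name out of a workspace resource ID.
function Get-ResourceGroupFromId {
    param([Parameter(Mandatory)][string]$ResourceId)

    # -match is case-insensitive, so "resourceGroups" and "resourcegroups" both work.
    if ($ResourceId -match '/resourcegroups/([^/]+)') {
        return $Matches[1]
    }
    throw "No resource group segment found in '$ResourceId'"
}

# Hypothetical helper: build a saved-search request body as a JSON string.
function New-SavedSearchBody {
    param(
        [Parameter(Mandatory)][string]$DisplayName,
        [Parameter(Mandatory)][string]$Query,
        [string]$Category = 'MS Exchange',
        [string]$ETag
    )

    $body = @{
        properties = @{
            Category    = $Category
            DisplayName = $DisplayName
            Query       = $Query
        }
    }

    # Only add the etag when the caller supplies one (assumption: it may be optional).
    if ($ETag) { $body['etag'] = $ETag }

    # ConvertTo-Json handles the escaping of quotes inside the query text.
    return ($body | ConvertTo-Json -Depth 3 -Compress)
}

# Example with a made-up workspace ID and one query from this post.
$rg   = Get-ResourceGroupFromId -ResourceId '/subscriptions/0000/resourcegroups/my-rg/providers/Microsoft.OperationalInsights/workspaces/ws1'
$json = New-SavedSearchBody -DisplayName 'Free Disk Space is less than 10%' -Query 'Type=Perf ObjectName="LogicalDisk" CounterName="% Free Space" | measure avg(Average) by Computer | where AggregatedValue < 10'
```

Depending on your PowerShell version, double quotes inside the JSON may need extra escaping when passed to a native command such as armclient, so it is worth testing with a single saved search first.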

For your convenience, here’s a list of counters and event logs I added to my workspace before searching:

Performance Counters:

Process\% Processor Time
MSExchange Database\I/O Database Reads Average Latency
MSExchange Database\I/O Database Writes Average Latency
LogicalDisk\% Free Space
LogicalDisk\Avg. Disk Bytes/Transfer
Memory\% Committed Bytes In Use
Memory\Available MBytes
Processor\% User Time
Network Interface\Bytes Total/sec
MSExchange ADAccess Domain Controllers\LDAP Search Time
MSExchange ADAccess Domain Controllers\LDAP Read Time
Network Interface\Output Queue Length
Process\Private Bytes
MSExchange RpcClientAccess\RPC Averaged Latency
MSExchange Message Tracking\Get-MessageTrackingReport Task Executed
MSExchange Message Tracking\Get-MessageTrackingReport Task Executed/Sec

Event Logs:

Application
Microsoft-Exchange-ManagedAvailability/RecoveryActionResults

What’s Next?

Moving forward, we are thinking about building an OMS Exchange Solution that will help you better manage your Exchange environment by providing assessment, suggestions, and monitoring across all portfolios.

I hope this has been helpful! Enjoy searching and please post feedback or questions on UserVoice or leave a comment below!

– Nini