How To Effectively Capture Windows Memory Dumps (Pt 1: Using DebugDiag)



Written by Richard Case, Premier Field Engineer.


This is the first article in a series of blog posts on collecting Microsoft Windows memory dumps for specific scenarios. Each part will detail some of the tools and techniques that can be used to capture a memory dump for that scenario. These scenarios are based on real issues experienced by our customers that have been resolved by members of the Microsoft Premier Mission Critical team.

Scenario #1: A Process Exceeds a Resource Threshold

The first scenario we are going to look at is capturing a memory dump when a process exceeds a threshold for a particular resource, in this case memory. Suppose that after collecting some perfmon data you see the following:

clip_image001

‘Private Bytes’ for w3wp#2 (the green line) grew from 1.6 GB at 01:59:03 to 15.3 GB at 01:59:18. That’s 13.7 GB in 15 seconds! The process is being a little greedy.
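As a quick sanity check on those numbers, the growth and its rate can be worked out from the two samples. This is only a small sketch using the values quoted above; the function name is made up for illustration.

```python
from datetime import datetime

def growth_between_samples(start_gb, end_gb, start_time, end_time, fmt="%H:%M:%S"):
    """Return (growth in GB, elapsed seconds, GB per second) for two samples."""
    elapsed = (datetime.strptime(end_time, fmt)
               - datetime.strptime(start_time, fmt)).total_seconds()
    if elapsed <= 0:
        raise ValueError("end_time must be later than start_time")
    delta = end_gb - start_gb
    return delta, elapsed, delta / elapsed

# Values read from the perfmon capture described in the text.
delta, elapsed, rate = growth_between_samples(1.6, 15.3, "01:59:03", "01:59:18")
print(f"{delta:.1f} GB in {elapsed:.0f} s ({rate:.2f} GB/s)")
# prints: 13.7 GB in 15 s (0.91 GB/s)
```

A rate of close to 1 GB per second is worth keeping in mind later when choosing how long a counter must stay above its threshold.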

You also see that ‘# Bytes in all Heaps’ (the highlighted line) grows in line with ‘Private Bytes.’ The screenshot below is for the same case but a different data capture:

clip_image003

This indicates that the increase in memory usage is associated with managed (i.e. .NET) memory. So what are some of the options for getting a memory dump of a process in this scenario?
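One rough way to put a number on “grows in line with” is to compare how much of the Private Bytes growth is matched by growth in ‘# Bytes in all Heaps’. The sketch below assumes you have exported start and end values for both counters; the sample values and the 0.8 cut-off are hypothetical, not figures from the capture above.

```python
def managed_share_of_growth(private_start, private_end, heaps_start, heaps_end):
    """Fraction of Private Bytes growth matched by growth in managed heaps."""
    private_growth = private_end - private_start
    if private_growth <= 0:
        raise ValueError("Private Bytes did not grow between the two samples")
    return (heaps_end - heaps_start) / private_growth

# Hypothetical samples in GB: Private Bytes, then # Bytes in all Heaps.
share = managed_share_of_growth(2.0, 12.0, 1.0, 10.0)

# Assumed rule of thumb: if most of the growth is in the managed heaps,
# start the investigation on the .NET side.
LIKELY_MANAGED_CUTOFF = 0.8
print("likely managed" if share >= LIKELY_MANAGED_CUTOFF else "look at native memory too")
```

A share close to 1 suggests the leak is in managed memory; a low share suggests native allocations deserve a look as well.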

Memory Dump Option 1: Using DebugDiag

The “Debug Diagnostics Tool,” better known as DebugDiag, is a great tool for capturing memory dumps in specific situations, including a crash or hang of a process or even a memory leak. At the time of writing the latest version is 1.2, and it can be downloaded from the Microsoft Download Center.

For this scenario we would create a Performance rule, which allows us to capture a dump when the value of a specified performance counter exceeds a value we define. Open DebugDiag, click Add Rule… and choose Performance:

clip_image004

We want to capture a dump when the memory usage of a process exceeds a threshold. One of the easiest ways to determine the memory usage of a process (apart from using Task Manager or Process Explorer) is to use performance counters, specifically the Process object. So let’s choose Performance Counters (we’ll cover the other options in different posts) as the rule type:

clip_image005

Next we need to define the list of counters we want to monitor and the threshold value that must be exceeded for an action to be taken. Click Add Perf Triggers… and choose the counter and instance you want to monitor. In this scenario we are going to use the Private Bytes counter from the Process object for the w3wp instance we are interested in. If you don’t know which instance of w3wp you are interested in, or the issue can happen in any instance of w3wp, you can monitor each instance by adding multiple perf triggers.
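If you need to add a trigger per instance, it helps to list the counter paths first. Perfmon names multiple instances of the same executable w3wp, w3wp#1, w3wp#2 and so on, which is where the w3wp#2 in the earlier capture comes from. The helper below is a hypothetical sketch that just builds those path strings; it does not talk to DebugDiag or perfmon.

```python
def private_bytes_counter_paths(process_name, instance_count):
    """Build Process\\Private Bytes counter paths for each perfmon instance."""
    paths = []
    for index in range(instance_count):
        # The first instance has no suffix; later ones are name#1, name#2, ...
        instance = process_name if index == 0 else f"{process_name}#{index}"
        paths.append(rf"\Process({instance})\Private Bytes")
    return paths

for path in private_bytes_counter_paths("w3wp", 3):
    print(path)
```

Bear in mind that the instance suffixes are assigned by perfmon and can shift when a process exits, so check them against the PIDs before relying on a particular name.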

For each counter you add, you need to set the threshold and the length of time the counter must stay above that threshold before any action is taken. Select each counter in turn and choose Edit Thresholds…:

clip_image006

The values for this need to be determined based on the perfmon data you previously collected for your specific scenario. You want sufficiently high values so that dumps aren’t triggered on very brief spikes. In our case we can see that the process is above 10 GB for at least 30 seconds, so we are going to use a threshold of 10 GB for at least 5 seconds.
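To see why the duration matters, here is a minimal sketch of a “sustained threshold” check run against made-up one-second samples. It illustrates the idea only and is not how DebugDiag is implemented; the function name, the sample data and the choice of 1 GB = 1024³ bytes are assumptions.

```python
GIB = 1024 ** 3  # assumption: treat 1 GB as 1024^3 bytes

def first_trigger_time(samples, threshold_bytes, hold_seconds):
    """Return the first sample time at which the value has stayed above
    the threshold for at least hold_seconds, or None if it never does."""
    above_since = None
    for t, value in samples:
        if value > threshold_bytes:
            if above_since is None:
                above_since = t          # start of the current run above threshold
            if t - above_since >= hold_seconds:
                return t
        else:
            above_since = None           # a dip below the threshold resets the clock
    return None

# Hypothetical data: a one-second spike at t=1, then sustained growth from t=3.
samples = [(0, 2 * GIB), (1, 11 * GIB), (2, 3 * GIB)]
samples += [(t, 12 * GIB) for t in range(3, 10)]

print(first_trigger_time(samples, 10 * GIB, 5))  # prints 8
```

The brief spike at t=1 is ignored, and the rule only fires once the value has been above 10 GB for five consecutive seconds.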

Once you have set all the thresholds, click Next to define which process or processes you actually want to dump. These are called the Dump Targets.

Click Add Dump Target to select the type of target you want to add. You have a number of options to choose from here. If there is only one instance of the process you want to dump, then you should select Executable as the target type and then select the process name from the list.

Note 1: If you have more than one instance of your process running, then selecting it will cause dumps to be taken from all running instances of that process at the time the threshold was exceeded (even though you selected a specific instance in the Dump Target dialog box). This will cause all instances of that process to be paused whilst a dump is generated. If this is an issue, then one of the options discussed in a later part of this scenario may be more suitable.

If you are targeting IIS, you may also use the All active IIS/COM+ related processes or Web application pool target types. Let’s assume we have one instance of w3wp.exe, so we choose Executable and select w3wp.exe from the list:

clip_image007

When you select the dump target, you’ll be asked to configure how many dumps you would like to capture. This allows you to take a series of user dumps at timed intervals after the threshold has been exceeded, which is very useful if you want to see what has changed in the process over a period of time.

Note 2: If you only want to take one user dump, then make sure you enter 1 for Stop after generating.

It’s also worth noting the comment on the screen. If you are dumping a .NET application (including an ASP.NET application), then you should select the specific options marked:

clip_image008

Lastly, give your rule a meaningful name and a location in which to store the dumps. You can then decide to activate the rule now or at a later time. If you decide NOT to activate it now, you can activate it at a later date by right-clicking the rule in the list and choosing Activate Rule.

If a dump is created, the UserDump Count column in the rule list will be updated to show how many dumps have been taken, and the rule will be deactivated once it has taken the specified number of dumps:

clip_image009

What happens if we want to dump a specific instance of a process when there are multiple instances running? We’ll cover that in part 2 of this scenario.
