Kernel-mode dump analysis

I’ve already covered the different types of memory dump in a previous blog entry, so this is a quick dip into how we manually trigger a bugcheck to create a memory dump on demand, and also how we can take a look inside the kernel of a running OS without crashing it.


Crash Landing

In the event of a hung server, it may be desirable to generate a memory dump manually. All we do here is deliberately invoke a bugcheck; the normal error handling then dumps the contents of physical RAM to the pagefile, and the data is extracted from there to MEMORY.DMP on the next restart.

There are many ways to achieve this, but a total hang, a failure to log on and/or unresponsiveness over the network limits the choices somewhat.


The classic method is “crash on CTRL-SCROLL”, which required a PS/2 keyboard (along with a registry setting): SCROLL LOCK is pressed twice whilst holding down the right-hand CTRL key.
This became a problem with the general shift away from the limited-value PS/2 ports towards USB keyboards and mice, as these do not go through the same I/O controller.

To get around this problem for Windows Server 2003 (RTM and SP1) a hotfix package was created, which replaces KBDHID.SYS to allow this to work; the fix is included in SP2.
The hotfix was not produced for XP and was not included in Vista either, but the feature is available for Windows Server 2008 SP2, and from Windows 7 onwards.
Ref: Forcing a System Crash from the Keyboard


To enable manual dumps via CTRL-SCROLL LOCK with a PS/2 keyboard:
Path: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters
Name: CrashOnCtrlScroll
Data: 1

To enable manual dumps via CTRL-SCROLL LOCK with a USB keyboard (where supported):
Path: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\kbdhid\Parameters
Name: CrashOnCtrlScroll
Data: 1


Some servers, specifically blade servers, do not have a keyboard attached with which to trigger a dump by this method, but they may have a “Non-Maskable Interrupt” (NMI) button, which is a hardware method of achieving the same result. Some devices also provide a “virtual NMI button” through a web interface or agent, which gets a kernel-mode driver to trigger it manually.

To enable a bugcheck when the NMI button is pressed:
Path: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl
Name: NMICrashDump
Data: 1
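
If you need to roll these settings out to several servers, it can be handy to keep them in a .reg file. The following is only a sketch: it renders the three values listed above as .reg text, assuming each one is a REG_DWORD (the listings above only give the data as 1), and the helper and variable names are my own invention for illustration.

```python
# Illustrative sketch: render the registry values described above as .reg text.
# Assumption: each value is a REG_DWORD; names below are hypothetical helpers.

CRASH_SETTINGS = [
    # (registry key, value name, DWORD data)
    (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters",
     "CrashOnCtrlScroll", 1),  # PS/2 keyboard
    (r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\kbdhid\Parameters",
     "CrashOnCtrlScroll", 1),  # USB keyboard (where supported)
    (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl",
     "NMICrashDump", 1),       # NMI button
]


def render_reg_file(settings):
    """Return .reg file text that sets each (key, name, dword) triple."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, name, data in settings:
        lines.append("[{}]".format(key))
        # DWORD data is written as eight hexadecimal digits
        lines.append('"{}"=dword:{:08x}'.format(name, data))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_reg_file(CRASH_SETTINGS))
```

Only import the entries that apply to the hardware in question; the kbdhid value has no effect on a system without the USB keyboard support described earlier.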

A bugcheck initiated with CTRL-SCROLL LOCK will have a STOP code of 0xE2 (MANUALLY_INITIATED_CRASH), while one triggered by NMI will be 0x80 (NMI_HARDWARE_FAILURE).

Trying to analyze these dumps for a cause of the crash is obviously pointless: we know it was done on demand by the user.

Pre-Crash Checks

The bugcheck procedure requires a pagefile to which the contents of physical memory are written. Before Windows Vista this had to be present on the boot volume (%systemdrive%); that is no longer a limitation, but the pagefile must still be large enough to accommodate the dump, and the destination must have the same amount of free disk space.

So a Windows Server 2008 machine with 32GB RAM producing a complete memory dump would require a pagefile of 32GB+50MB, plus the same amount of free disk space at the destination (default is %systemroot%\MEMORY.DMP).

The same server producing a kernel memory dump would probably be okay with 2GB+50MB. This will definitely suffice for an x86 server, as 2GB is the upper limit for the size of the kernel address space, plus overhead for the dump file. Theoretically an x64 server could have a 128GB kernel address space, so it is possible it would need 32GB+50MB, as the kernel address space could consume almost all the physical memory.

Personally, I’ve not seen a kernel memory dump bigger than 1GB, even from an x64 system.
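
The sizing arithmetic above is easy to get wrong under pressure, so here is a rough sketch of it. It only encodes the worst-case rule of thumb from this section (dump size plus a 50MB allowance); the function names and the idea of passing the kernel address space limit as a parameter are my own assumptions, not anything Windows exposes.

```python
# Rough sketch of the worst-case pagefile sizing described above.
# All names here are hypothetical; figures are the ones quoted in the text.

MB = 1024 ** 2
GB = 1024 ** 3
DUMP_OVERHEAD = 50 * MB  # the "+50MB" allowance


def complete_dump_requirement(ram_bytes):
    """Pagefile (and free disk space) needed for a complete memory dump."""
    return ram_bytes + DUMP_OVERHEAD


def kernel_dump_requirement(ram_bytes, kernel_space_limit_bytes):
    """Worst case for a kernel memory dump.

    The kernel cannot occupy more physical memory than is installed,
    nor more than its address space allows, so take the smaller of the two.
    """
    return min(ram_bytes, kernel_space_limit_bytes) + DUMP_OVERHEAD


if __name__ == "__main__":
    ram = 32 * GB
    print(complete_dump_requirement(ram) // MB, "MB for a complete dump")
    print(kernel_dump_requirement(ram, 2 * GB) // MB, "MB worst case, x86 kernel dump")
    print(kernel_dump_requirement(ram, 128 * GB) // MB, "MB worst case, x64 kernel dump")
```

In practice a kernel dump is usually far smaller than the worst case, as noted above, so treat these numbers as an upper bound rather than a prediction.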

A Crash-less Crash Dump

One of the tools that Mark Russinovich created is LiveKd – this allows you to run a kernel debugger locally without crashing the server, so you can take a look inside the kernel even if you’ve not booted in DEBUG mode.

A recent update now allows this tool to run on x64 systems, and systems with >4GB RAM installed.

The tool requires the Debugging Tools for Windows, and as it works by taking a “snapshot” of the kernel and does not freeze it, some of the data cannot be relied upon for accuracy (try tuning a car engine whilst it is being driven!).

The Live Debug

Live debugging requires a separate machine running the debugger, connected to the target machine (debuggee) through one of the supported methods: classically a COM port, but now also FireWire, USB and even named pipes in the case of virtual machines.

Whilst the debugger is in control of the debuggee, the debuggee is frozen. Bear this in mind if you ever do this on a production machine: the server will not respond to anything until the debugger tells it to resume with a ‘g’ (go) command.

The debuggee must be started in DEBUG mode – this is controlled through BOOT.INI before Windows Vista, and through BCDEDIT more recently.

The debugger must have access to symbols for the debuggee. At the absolute minimum it needs to be able to make sense of the “nt” module, as this is the kernel; without that there is no chance of working out where the data structures are.

As an example, here is how we would set up a virtual machine in Hyper-V running Windows Server 2008 to be live debugged…

In the Settings of the virtual machine you need to specify a name for the pipe which the host will use to communicate with the VM – here I used the name “w2k8com1” which will lead to the named pipe path “\\.\pipe\w2k8com1” on the host.
(If the debug is being done remotely then the server name is used in place of the dot.)
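
To make the local versus remote pipe path explicit, here is a tiny sketch that builds the path from the pipe name. The function name and the example host name HVHOST01 are made up for illustration; only the pipe name “w2k8com1” comes from the setup above.

```python
# Sketch: build the named pipe path used by the debugger.
# pipe_path and the host name below are hypothetical, for illustration only.

def pipe_path(pipe_name, host="."):
    r"""Return \\<host>\pipe\<pipe_name>; "." means the local machine."""
    return "\\\\{}\\pipe\\{}".format(host, pipe_name)


if __name__ == "__main__":
    print(pipe_path("w2k8com1"))              # debugger running on the Hyper-V host
    print(pipe_path("w2k8com1", "HVHOST01"))  # debugger running on another machine
```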

Fig 1 - VM settings 

Within the virtual machine itself I now need to enable DEBUG mode and tell the kernel to use COM1 at 115200 baud, so from an elevated command prompt enter the following commands:

bcdedit /debug on
bcdedit /dbgsettings serial debugport:1 baudrate:115200

You can verify debug mode is enabled by typing bcdedit – you should see the debug setting set to Yes in the summary.

You can verify the debug settings by typing bcdedit /dbgsettings – you should see the settings entered above reported.

For a more detailed look at the options available, check out this MSDN page.

Set the symbols path on the host – the easiest method is to create a system environment variable named _NT_SYMBOL_PATH and then enter a string allowing a local cache and the upstream server as Microsoft’s public symbol server:
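
The general shape of such a string is srv*<local cache>*<upstream server>. As a sketch, the snippet below assembles one; the cache folder C:\Symbols is purely an assumed example (use whatever folder you prefer), and the helper name is hypothetical. The upstream URL is Microsoft’s public symbol server.

```python
# Sketch: assemble a value for _NT_SYMBOL_PATH.
# The cache directory is an assumed example; symbol_path is a hypothetical helper.

MS_PUBLIC_SYMBOLS = "https://msdl.microsoft.com/download/symbols"


def symbol_path(cache_dir, server=MS_PUBLIC_SYMBOLS):
    """Return a symbol-server path with a local downstream cache."""
    return "srv*{}*{}".format(cache_dir, server)


if __name__ == "__main__":
    print(symbol_path(r"C:\Symbols"))
```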



Now start up an elevated WinDbg on the host (if it is not elevated you get an access denied message trying to connect to the named pipe).

Click File / Kernel Debug
On the COM tab set the fields as below and then click OK:
- Baud Rate = 115200
- Port = \\.\pipe\w2k8com1
- Pipe = [checked]
Click Yes on the prompt to save the workspace, and now the debugger will sit waiting for the debuggee to send messages or for the user to instruct it to break in.

Note that we can’t break into the VM yet, as we haven’t started it in DEBUG mode. Reboot the VM and watch the debugger window; you will get something like this:

Microsoft (R) Windows Debugger Version 6.12.0001.591 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.
Waiting to reconnect...
Connected to Windows Server 2008/Windows Vista 6001 x86 compatible target at (Tue Dec 22 10:31:07.543 2009 (UTC + 1:00)), ptr64 FALSE
Kernel Debugger connection established.
Symbol search path is: srv*C:\*
Executable search path is:
Windows Server 2008/Windows Vista Kernel Version 6001 MP (1 procs) Free x86 compatible
Built by: 6001.18000.x86fre.longhorn_rtm.080118-1840
Machine Name:
Kernel base = 0x81614000 PsLoadedModuleList = 0x8172bc70
System Uptime: not available

In WinDbg you can either click Debug / Break or hit CTRL-BREAK on the keyboard; this will freeze the debuggee and give you control, with the following message:

Break instruction exception - code 80000003 (first chance)
*                                                                             *
*   You are seeing this message because you pressed either                    *
*       CTRL+C (if you run kd.exe) or,                                        *
*       CTRL+BREAK (if you run WinDBG),                                       *
*   on your debugger machine's keyboard.                                      *
*                                                                             *
*                   THIS IS NOT A BUG OR A SYSTEM CRASH                       *
*                                                                             *
* If you did not intend to break into the debugger, press the "g" key, then   *
* press the "Enter" key now.  This message might immediately reappear.  If it *
* does, press "g" and "Enter" again.                                          *
*                                                                             *
816cc514 cc              int     3


Now you can take a look around the system and see what is started, get a summary of the virtual memory, etc. Be aware, however, that we probably don’t have many of the symbols cached yet, so it is often useful to enable “noisy” symbol loading and then force a reload of all the modules; this way you get the majority of the load delays out of the way right at the start.

You will see the status *BUSY* in the bottom left corner, and each symbol (.PDB) file being sought – depending on the speed of the Internet connection this can take a few minutes:

0: kd> !sym noisy
noisy mode - symbol prompts on
0: kd> .reload /f
Connected to Windows Server 2008/Windows Vista 6001 x86 compatible target at (Tue Dec 22 10:39:05.287 2009 (UTC + 1:00)), ptr64 FALSE
SYMSRV:  ntkrpamp.pdb from 1771779 bytes - copied        
DBGHELP: nt - public symbols 
Loading Kernel Symbols
SYMSRV:  halmacpi.pdb from 74221 bytes - copied        
DBGHELP: hal - public symbols 
SYMSRV:  kdcom.pdb from 3804 bytes - copied        
DBGHELP: kdcom - public symbols 

SYMSRV:  peauth.pdb from 181517 bytes - copied        
DBGHELP: peauth - public symbols 
SYMSRV:  c:\\secdrv.pdb\7578144C39C4468394EF84F01549113A3\secdrv.pdb not found
SYMSRV: not found
DBGHELP: secdrv.pdb - file not found
*** ERROR: Module load completed but symbols could not be loaded for secdrv.SYS
DBGHELP: secdrv - no symbols loaded
SYMSRV:  tcpipreg.pdb from 11607 bytes - copied        
DBGHELP: tcpipreg - public symbols 

Loading User Symbols

Loading unloaded module list

0: kd> !sym quiet
quiet mode - symbol prompts on

Now that we have symbols sorted, we can start to look around. The !vm command will give you a virtual memory overview and the list of processes with their private memory usage in pages and Kb:

0: kd> !vm
*** Virtual Memory Usage ***
Physical Memory:      130724 (    522896 Kb)
Page File: \??\C:\pagefile.sys
   Current:   1048576 Kb  Free Space:   1048572 Kb
   Minimum:   1048576 Kb  Maximum:      4194304 Kb
Available Pages:       61283 (    245132 Kb)
ResAvail Pages:       106796 (    427184 Kb)
Locked IO Pages:           0 (         0 Kb)
Free System PTEs:     427195 (   1708780 Kb)
Modified Pages:         2429 (      9716 Kb)
Modified PF Pages:      2424 (      9696 Kb)
NonPagedPool Usage:        0 (         0 Kb)
NonPagedPoolNx Usage:   3870 (     15480 Kb)
NonPagedPool Max:      95231 (    380924 Kb)
PagedPool 0 Usage:      3867 (     15468 Kb)
PagedPool 1 Usage:      1906 (      7624 Kb)
PagedPool 2 Usage:        41 (       164 Kb)
PagedPool 3 Usage:        24 (        96 Kb)
PagedPool 4 Usage:        77 (       308 Kb)
PagedPool Usage:        5915 (     23660 Kb)
PagedPool Maximum:    523264 (   2093056 Kb)
Session Commit:         2427 (      9708 Kb)
Shared Commit:          5539 (     22156 Kb)
Special Pool:              0 (         0 Kb)
Shared Process:         1629 (      6516 Kb)
PagedPool Commit:       5922 (     23688 Kb)
Driver Commit:          1801 (      7204 Kb)
Committed pages:       59129 (    236516 Kb)
Commit limit:         382336 (   1529344 Kb)

Total Private:         35993 (    143972 Kb)
         0284 lsass.exe         4883 (     19532 Kb)
         04cc svchost.exe       2889 (     11556 Kb)
         042c svchost.exe       2733 (     10932 Kb)
         01e0 svchost.exe       2718 (     10872 Kb)
         06d8 vmicsvc.exe       2047 (      8188 Kb)
         07a4 ntfrs.exe         1880 (      7520 Kb)
         03f4 LogonUI.exe       1648 (      6592 Kb)
         054c svchost.exe       1544 (      6176 Kb)
         06b4 spoolsv.exe       1245 (      4980 Kb)
         03fc svchost.exe       1171 (      4684 Kb)
         0440 SLsvc.exe         1100 (      4400 Kb)
         0748 dfsrs.exe          953 (      3812 Kb)
         0758 dns.exe            876 (      3504 Kb)
         0474 svchost.exe        716 (      2864 Kb)
         06cc vmicsvc.exe        706 (      2824 Kb)
         0710 vmicsvc.exe        633 (      2532 Kb)
         06ec vmicsvc.exe        633 (      2532 Kb)
         06fc vmicsvc.exe        631 (      2524 Kb)
         039c svchost.exe        600 (      2400 Kb)
         027c services.exe       587 (      2348 Kb)
         0640 taskeng.exe        565 (      2260 Kb)
         0774 ismserv.exe        527 (      2108 Kb)
         0358 svchost.exe        506 (      2024 Kb)
         01b8 svchost.exe        483 (      1932 Kb)
         0354 dfssvc.exe         479 (      1916 Kb)
         0420 svchost.exe        410 (      1640 Kb)
         028c lsm.exe            406 (      1624 Kb)
         01e8 csrss.exe          395 (      1580 Kb)
         0004 System             381 (      1524 Kb)
         0214 csrss.exe          337 (      1348 Kb)
         024c winlogon.exe       321 (      1284 Kb)
         021c wininit.exe        302 (      1208 Kb)
         04b0 svchost.exe        260 (      1040 Kb)
         01bc svchost.exe        222 (       888 Kb)
         020c svchost.exe        134 (       536 Kb)
         01a4 smss.exe            72 (       288 Kb)
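
Each counter in the !vm output is a page count followed by the same figure in Kb, and on this x86 target a page is 4 Kb, so the two columns should always differ by a factor of four. The sketch below parses a counter line and checks that; the regular expression and function name are my own and only cover the simple "label: pages (Kb)" lines shown above.

```python
# Sketch: parse a "!vm" counter line and sanity-check pages against Kb.
# Assumes 4 Kb pages (true for this x86 target); helper names are hypothetical.
import re

PAGE_SIZE_KB = 4

VM_LINE = re.compile(
    r"^\s*(?P<label>[^:]+):\s+(?P<pages>\d+)\s+\(\s*(?P<kb>\d+)\s+Kb\)"
)


def parse_vm_line(line):
    """Return (label, pages, kb) for a counter line, or None if it is not one."""
    match = VM_LINE.match(line)
    if match is None:
        return None
    return match.group("label").strip(), int(match.group("pages")), int(match.group("kb"))


if __name__ == "__main__":
    label, pages, kb = parse_vm_line("Physical Memory:      130724 (    522896 Kb)")
    print(label, pages, kb, pages * PAGE_SIZE_KB == kb)
```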

As this is a live debug, we have access to all physical memory, so we can set the context to a specific process and look in its user-mode space. First we need the process object reference, so let’s pick on LSASS.EXE:

0: kd> !process 0 0 lsass.exe
PROCESS 884aed90  SessionId: 0  Cid: 0284    Peb: 7ffdd000  ParentCid: 021c
    DirBase: 1f7c90e0  ObjectTable: 8eddfb28  HandleCount: 943.
    Image: lsass.exe

Now we know the process object we can switch the debugger context to this process and force a reload of the symbols at the same time:

0: kd> .process /p /r 884aed90
Implicit process is now 884aed90
.cache forcedecodeuser done
Loading User Symbols

We can take a look at the Process Environment Block to see what modules are loaded, the command line, window title and environment the process was started with, using the !peb command:

0: kd> !peb
PEB at 7ffdd000
    InheritedAddressSpace:    No
    ReadImageFileExecOptions: No
    BeingDebugged:            No
    ImageBaseAddress:         00700000
    Ldr                       77164cc0
    Ldr.Initialized:          Yes
    Ldr.InInitializationOrderModuleList: 00361500 . 03959380
    Ldr.InLoadOrderModuleList:           00361480 . 03959370
    Ldr.InMemoryOrderModuleList:         00361488 . 03959378
            Base TimeStamp                     Module
          700000 47918d7c Jan 19 06:41:16 2008 C:\Windows\system32\lsass.exe
        770a0000 4791a7a6 Jan 19 08:32:54 2008 C:\Windows\system32\ntdll.dll
        76d40000 4791a76d Jan 19 08:31:57 2008 C:\Windows\system32\kernel32.dll
        76800000 4791a64b Jan 19 08:27:07 2008 C:\Windows\system32\ADVAPI32.dll

        73700000 4549bda2 Nov 02 10:42:58 2006 C:\Windows\system32\rasadhlp.dll
        73580000 4791a74a Jan 19 08:31:22 2008 C:\Windows\system32\rpchttp.dll
        734d0000 4791a6ba Jan 19 08:28:58 2008 C:\Windows\system32\dssenh.dll
        736e0000 4791a775 Jan 19 08:32:05 2008 C:\Windows\system32\cscapi.dll
    SubSystemData:     00000000
    ProcessHeap:       00360000
    ProcessParameters: 00360d90
    CurrentDirectory:  'C:\Windows\system32\'
    WindowTitle:  'C:\Windows\system32\lsass.exe'
    ImageFile:    'C:\Windows\system32\lsass.exe'
    CommandLine:  'C:\Windows\system32\lsass.exe'
    DllPath:      'C:\Windows\system32;C:\Windows\system32;C:\Windows\system;C:\Windows;.;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\'
    Environment:  003607e8
        CommonProgramFiles=C:\Program Files\Common Files
        PROCESSOR_IDENTIFIER=x86 Family 6 Model 23 Stepping 6, GenuineIntel
        ProgramFiles=C:\Program Files


To understand what difference it makes to set the context to a specific process, here is the output for 1 thread returned by !process 884aed90…

Before (in the kernel context):

        THREAD 888dc030  Cid 0284.0878  Teb: 7ff8e000 Win32Thread: 00000000 WAIT: (Executive) UserMode Non-Alertable
            8879839c  NotificationEvent
        IRP List:
            88477590: (0006,01d8) Flags: 00060070  Mdl: 00000000
        Not impersonating
        DeviceMap                 8b008748
        Owning Process            884aed90       Image:         lsass.exe
        Attached Process          N/A            Image:         N/A
        Wait Start TickCount      14673          Ticks: 71 (0:00:00:01.109)
        Context Switch Count      311            
        UserTime                  00:00:00.000
        KernelTime                00:00:00.000
        Win32 Start Address 0x74de1b33
        Stack Init 926fc000 Current 926fbb80 Base 926fc000 Limit 926f9000 Call 0
        Priority 11 BasePriority 9 PriorityDecrement 0 IoPriority 2 PagePriority 5
        ChildEBP RetAddr  Args to Child             
        926fbb98 816cb3bf 888dc030 888dc0b8 8170c920 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])
        926fbbdc 81668cf8 888dc030 88798340 88477590 nt!KiSwapThread+0x44f
        926fbc30 8186058d 8879839c 00000000 00000001 nt!KeWaitForSingleObject+0x492
        926fbc64 81860cba 00000103 88798340 049bea00 nt!IopSynchronousServiceTail+0x251
        926fbd00 8184a98e 882df448 88477590 00000000 nt!IopXxxControlFile+0x6b7
        926fbd34 8166ba7a 00000c08 00000f48 00000000 nt!NtDeviceIoControlFile+0x2a
        926fbd34 770f9a94 00000c08 00000f48 00000000 nt!KiFastCallEntry+0x12a (FPO: [0,3] TrapFrame @ 926fbd64)
WARNING: Frame IP not in any known module. Following frames may be wrong.
        049beb10 00000000 00000000 00000000 00000000 0x770f9a94


After (in the process context):

        THREAD 888dc030  Cid 0284.0878  Teb: 7ff8e000 Win32Thread: 00000000 WAIT: (Executive) UserMode Non-Alertable
            8879839c  NotificationEvent
        IRP List:
            88477590: (0006,01d8) Flags: 00060070  Mdl: 00000000
        Not impersonating
        DeviceMap                 8b008748
        Owning Process            884aed90       Image:         lsass.exe
        Attached Process          N/A            Image:         N/A
        Wait Start TickCount      14673          Ticks: 71 (0:00:00:01.109)
        Context Switch Count      311            
        UserTime                  00:00:00.000
        KernelTime                00:00:00.000
        Win32 Start Address netlogon!NlWorkerThread (0x74de1b33)
        Stack Init 926fc000 Current 926fbb80 Base 926fc000 Limit 926f9000 Call 0
        Priority 11 BasePriority 9 PriorityDecrement 0 IoPriority 2 PagePriority 5
        ChildEBP RetAddr 
        926fbb98 816cb3bf nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])
        926fbbdc 81668cf8 nt!KiSwapThread+0x44f
        926fbc30 8186058d nt!KeWaitForSingleObject+0x492
        926fbc64 81860cba nt!IopSynchronousServiceTail+0x251
        926fbd00 8184a98e nt!IopXxxControlFile+0x6b7
        926fbd34 8166ba7a nt!NtDeviceIoControlFile+0x2a
        926fbd34 770f9a94 nt!KiFastCallEntry+0x12a (FPO: [0,3] TrapFrame @ 926fbd64)
        049bea08 770f8444 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
        049bea0c 74f41f34 ntdll!ZwDeviceIoControlFile+0xc (FPO: [10,0,0])
        049beb10 76431693 mswsock!WSPSelect+0x364 (FPO: [Non-Fpo])
        049beb90 753c6b2c WS2_32!select+0x494 (FPO: [Non-Fpo])
        049becc8 753c70c4 DNSAPI!Recv_Udp+0xd7 (FPO: [Non-Fpo])
        049bedb0 753c7591 DNSAPI!Send_AndRecvUdpWithParam+0x1d8 (FPO: [Non-Fpo])
        049bee5c 753c7478 DNSAPI!Send_AndRecv+0x95 (FPO: [Non-Fpo])
        049beee4 753c7334 DNSAPI!Query_Wire+0xed (FPO: [Non-Fpo])
        049beefc 753c30c6 DNSAPI!Query_SingleNamePrivate+0x83 (FPO: [Non-Fpo])
        049bef08 753c4c4b DNSAPI!Query_SingleName+0x1d (FPO: [Non-Fpo])
        049bef2c 753c4929 DNSAPI!Query_AllNames+0xa9 (FPO: [Non-Fpo])
        049bef50 753ca956 DNSAPI!Query_Main+0x7b (FPO: [Non-Fpo])
        049bef6c 753ca8b7 DNSAPI!Query_InProcess+0x68 (FPO: [Non-Fpo])
        049befc8 753c3b34 DNSAPI!Query_PrivateExW+0x30c (FPO: [Non-Fpo])
        049bf004 753cb3a0 DNSAPI!Query_Shim+0xbb (FPO: [Non-Fpo])
        049bf02c 753cbc67 DNSAPI!Query_Private+0x1f (FPO: [Non-Fpo])
        049bf068 753cbc18 DNSAPI!Faz_PrivateEx+0x51 (FPO: [Non-Fpo])
        049bf084 753cbbe6 DNSAPI!Faz_Private+0x18 (FPO: [Non-Fpo])
        049bf0a0 753d97ef DNSAPI!Faz_Simple+0x30 (FPO: [Non-Fpo])
        049bf260 753d3e61 DNSAPI!Faz_CollapseDnsServerListsForUpdate+0x45 (FPO: [Non-Fpo])
        049bf5b8 753dcb94 DNSAPI!Update_Private+0x105 (FPO: [Non-Fpo])
        049bf690 753dcc1c DNSAPI!modifyRecordsInSetPrivate+0x112 (FPO: [Non-Fpo])
        049bf6b8 74da8a9d DNSAPI!DnsModifyRecordsInSet_UTF8+0x20 (FPO: [Non-Fpo])
        049bf8bc 74da8d32 netlogon!NlDnsUpdate+0x252 (FPO: [Non-Fpo])
        049bf8d8 74da9f10 netlogon!NlDnsRegisterOne+0x1c (FPO: [Non-Fpo])
        049bf900 74daccde netlogon!NlDnsScavengeOne+0x77 (FPO: [Non-Fpo])
        049bf94c 74de1bd1 netlogon!NlDnsScavengeWorker+0x1a7 (FPO: [Non-Fpo])
        049bf964 76d84911 netlogon!NlWorkerThread+0x9e (FPO: [Non-Fpo])
        049bf970 770de4b6 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])
        049bf9b0 770de489 ntdll!__RtlUserThreadStart+0x23 (FPO: [Non-Fpo])
        049bf9c8 00000000 ntdll!_RtlUserThreadStart+0x1b (FPO: [Non-Fpo])

The frame to compare in both cases is the first kernel-mode frame, nt!KiFastCallEntry, whose return address is 0x770f9a94. From the “kernel only” view this virtual address lies in the user-mode address space of a process (remember 0x00000000-0x7fffffff is user-mode on x86 systems by default), so the debugger cannot resolve it to a module and warns that the following frames may be wrong.

When we set the context to a specific process we are then able to show its user-mode portion, and the stack looks completely different.
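
As a quick illustration of that address split, the sketch below classifies a few addresses taken from the stacks above. It assumes the default 2GB/2GB x86 layout mentioned earlier; a system booted with an enlarged user address space would need a different boundary, and the function name is just for illustration.

```python
# Sketch: classify x86 virtual addresses using the default 2GB/2GB split.
# Assumption: default layout; is_user_mode_address is a hypothetical helper.

USER_SPACE_TOP = 0x7FFFFFFF  # highest user-mode address by default on x86


def is_user_mode_address(addr):
    """True if addr falls in the default x86 user-mode range."""
    return 0 <= addr <= USER_SPACE_TOP


if __name__ == "__main__":
    for addr in (0x770F9A94, 0x74DE1B33, 0x816CB3BF):
        kind = "user" if is_user_mode_address(addr) else "kernel"
        print("0x{:08x} -> {}".format(addr, kind))
```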

It is possible for thread stacks to be paged out, both user and kernel-mode, in which case the debugger will report this to you as “stack not resident”.
