One of our common issues is troubleshooting application crashes (for example, the Print Spooler or a third-party application). These crashes usually result in the infamous Dr. Watson error.
First, let’s discuss terminology. A crash is when something experiences a fault and has no choice but to exit. In the case of an user mode process that generally means a Dr. Watson popup and application exit, and in the case of the kernel, a Bugcheck. A crash can be caused by something as simple as a value being set to zero when a function is expecting a non-zero response, or trying to access a section in memory that has either become damaged or that belongs to another process.
Determining the cause of an application crash can be very simple, or extremely complicated – depending on the failure. You can do something as simple as viewing the Dr. Watson log, or you can do a full debug. Debugging is very much an art, and a complicated one at that, so we are only going to touch on the really basic concepts in this discussion. First, let’s go the easy route.
When a process crashes, Dr. Watson will generally catch it and make a log entry as well as catch a dump of the faulting process. The log file is called Drwtsn32.log and it typically in the following directory: C:\Documents and Setting\All Users\Application Data\Microsoft\Dr Watson. This location is configurable using the Dr Watson configuration tool (drwtsn32.exe)
This file may very well be several megabytes in size, since the log may be appended to each time Dr. Watson handles an exception. The log is written top to bottom, so the last entry in the file is the most recent crash. Each entry starts off telling what process crashed and the basic type of crash it was:
Application exception occurred:
App: C:\Program Files\Test\SampleApp\SMPL.exe (pid=2348)
When: 12/8/2006 @ 14:19:10.261
Exception number: c0000005 (access violation)
OK – so the first thing to understand is what this information is telling us. It is fairly straightforward, but does contain very important basic data – the Name and Process ID (PID) of the application, the exception code (c0000005 in this case) and the exact time & date of the failure.
Under this will be quite a few pages of stack information, and unless you wrote the application probably won’t be too much help for you. But what you can do that sometimes helps is to find the actual fault itself. The easiest way to do this is to go to the bottom of the log and then search upward by doing a Find on the word ‘fault’. You should find an entry similar to this:
FAULT ->0040191c 8b4020 mov eax,[eax+0x20] ds:0023:00000020=????????
Below that will be the stack for the faulting thread, and may point to at least basically what is causing the issue. For instance, if the stack referenced a third-party .DLL, it is very possibly the cause of the crash. So, in the following example for instance, a module called CMGR caused a crash:
*—-> Stack Back Trace <—-*
WARNING: Stack unwind information not available. Following frames may be wrong.
ChildEBP RetAddr Args to Child
71ab4617 3d8151ec 71ac4028 71ab9448 f4840f56 CMGR+0x191c
8b55ff8b 00000000 00000000 00000000 00000000 0x3d8151ec
If looking at the Drwtsn32.log doesn’t help, we need to go a bit deeper, and actually debug the application fault. So to do this, we will need to install the Debugging Tools for Windows. Simply install the tools using the defaults and set your symbol path to the default listed on the web page. Now comes our choice in how you wish to actually do the debug. When an application crashes, it will generally be caught by Dr. Watson and create a dump in the same location as the Drwtsn32.log file.
This is the dump file we can open using the Debugging Tools for Windows. To do this, run Windbg.exe from the Debugging Tools folder. Click File – Open Crash Dump and browse to the .DMP file in question. After you open the file, the initial output looks similar to this:
Microsoft (R) Windows Debugger Version 6.7.0005.0 Copyright (c) Microsoft Corporation. All rights reserved. Loading Dump File [Z:\Dump\SVCHOST_2ndChance.dmp] User Mini Dump File with Full Memory: Only application data is available Comment: '2nd_chance_AccessViolation_exception_in_SVCHOST.EXE_running_on_<MACHINE_NAME>' Symbol search path is: SRV*C:\SYMBOLS*http://msdl.microsoft.com/download/symbols Executable search path is: Windows Server 2003 Version 3790 (Service Pack 1) UP Free x86 compatible Product: Server, suite: TerminalServer SingleUserTS Debug session time: Fri Apr 13 19:50:28.000 2007 (GMT-5) System Uptime: 28 days 6:45:18.796 Process Uptime: 20 days 23:59:39.000 ................................................................................................................... Loading unloaded module list ................ This dump file has an exception of interest stored in it. The stored exception information can be accessed via .ecxr. (1954.4708): Access violation - code c0000005 (first/second chance not available) eax=80070002 ebx=00000073 ecx=00000007 edx=000000dd esi=00000002 edi=7ffffffe eip=77bd4817 esp=0214f12c ebp=0214f59c iopl=0 nv up ei pl nz na po nc cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202 msvcrt!_woutput+0x6aa: 77bd4817 66833800 cmp word ptr [eax],0 ds:0023:80070002=????
OK – so what information do we get immediately? We can tell that this dump is for a crash of the SVCHOST.EXE process. Also take note of the Symbol Search Path – in this instance, I have a local symbol store that I am populating from the Microsoft Public Symbol server. Finally we can tell what platform this dump came from – a 32-bit Windows Server 2003 SP1 machine.
So where do we begin? The first command we run is !analyze -v. This tells the debugger to run an analysis in verbose mode. In the text input box at the bottom of the screen, type !analyze -v and hit Enter.
This may take a few minutes to complete the command as the debugger has to download and match symbols. When it completes, we get the following output which displays information about the thread, and the stack. It also includes a pointer to the module that caused the fault:
******************************************************************************* * * * Exception Analysis * * * ******************************************************************************* *** ERROR: Symbol file could not be found. Defaulted to export symbols for wuaueng.dll - FAULTING_IP: msvcrt!_woutput+6aa 77bd4817 66833800 cmp word ptr [eax],0 EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff) ExceptionAddress: 77bd4817 (msvcrt!_woutput+0x000006aa) ExceptionCode: c0000005 (Access violation) ExceptionFlags: 00000000 NumberParameters: 2 Parameter: 00000000 Parameter: 80070002 Attempt to read from address 80070002 DEFAULT_BUCKET_ID: APPLICATION_FAULT PROCESS_NAME: svchost.exe ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s. READ_ADDRESS: 80070002 BUGCHECK_STR: ACCESS_VIOLATION NTGLOBALFLAG: 0 APPLICATION_VERIFIER_FLAGS: 0 LAST_CONTROL_TRANSFER: from 77bd0f6e to 77bd4817 STACK_TEXT: 0214f59c 77bd0f6e 0214f5b4 5009b680 0214f904 msvcrt!_woutput+0x6aa 0214f5d4 5009fb9c 0214f710 000000d5 5009b680 msvcrt!_vsnwprintf+0x30 WARNING: Stack unwind information not available. Following frames may be wrong. 0214f600 5013f897 0214f710 000000d6 000001ac wuaueng!DllRegisterServer+0x130a 0214f628 5013fc9f 0214f710 000000d6 0214f660 wuaueng!WUAutoUpdateAtShutdown+0xa7e5 0214f664 501404a9 00000000 00000001 0214f6bc wuaueng!WUAutoUpdateAtShutdown+0xabed 0214f8c0 50140101 00000001 00000002 00000000 wuaueng!WUAutoUpdateAtShutdown+0xb3f7 0214f8ec 5016b76c 0214f918 00000000 00000001 wuaueng!WUAutoUpdateAtShutdown+0xb04f 0214f920 5016b911 0528f660 00000000 00000000 wuaueng!WUAutoUpdateAtShutdown+0x366ba 0214f954 5016b975 00000001 00000000 00d36868 wuaueng!WUAutoUpdateAtShutdown+0x3685f 0214f968 5016c266 00d36868 77d04350 00000001 wuaueng!WUAutoUpdateAtShutdown+0x368c3 0214f994 5015b245 00d36868 00003ab8 00000000 wuaueng!WUAutoUpdateAtShutdown+0x371b4 0214fc7c 5015a1f0 020380e8 00003ab8 020380e8 wuaueng!WUAutoUpdateAtShutdown+0x26193 0214fca0 50156c79 020380e8 0483ca80 00003ab8 wuaueng!WUAutoUpdateAtShutdown+0x2513e 0214fccc 500c2f02 00000000 00000000 002580a8 wuaueng!WUAutoUpdateAtShutdown+0x21bc7 0214fd5c 500b9aa6 0214fdec 0214fddc 00d2ec30 wuaueng!DllRegisterServer+0x24670 0214fe00 500a8e82 002580a8 002580a8 500a8261 wuaueng!DllRegisterServer+0x1b214 0214fe28 500b126e 0025818c 0214fe5c 500b1e1c wuaueng!DllRegisterServer+0xa5f0 0214fe34 500b1e1c 00000000 00000018 04c3d228 wuaueng!DllRegisterServer+0x129dc 0214fe5c 500b1fbc 00000000 500b1f8f 0214fec8 wuaueng!DllRegisterServer+0x1358a 0214fe6c 7c83b332 04c3d228 00000001 020b0f80 wuaueng!DllRegisterServer+0x1372a 0214fec8 7c820482 500b1f8f 04c3d228 00000001 ntdll!RtlpWaitOrTimerCallout+0x74 0214feec 7c81a3c5 020b0f80 7c889880 00159598 ntdll!RtlpAsyncTimerCallbackCompletion+0x2d 0214ff44 7c8200fc 7c820455 020b0f80 00000000 ntdll!RtlpWorkerCallout+0x71 0214ff64 7c81a3fa 00000000 020b0f80 00159598 ntdll!RtlpExecuteWorkerRequest+0x4f 0214ff78 7c82017f 7c8200bb 00000000 020b0f80 ntdll!RtlpApcCallout+0x11 0214ffb8 77e66063 00000000 00000000 00000000 ntdll!RtlpWorkerThread+0x61 0214ffec 00000000 7c83ad38 00000000 00000000 kernel32!BaseThreadStart+0x34 FOLLOWUP_IP: msvcrt!_woutput+6aa 77bd4817 66833800 cmp word ptr [eax],0 SYMBOL_STACK_INDEX: 0 FOLLOWUP_NAME: MachineOwner MODULE_NAME: msvcrt IMAGE_NAME: msvcrt.dll DEBUG_FLR_IMAGE_TIMESTAMP: 4243785e FAULTING_THREAD: 00004708 SYMBOL_NAME: msvcrt!_woutput+6aa STACK_COMMAND: ~39s; .ecxr ; kb FAILURE_BUCKET_ID: ACCESS_VIOLATION_msvcrt!_woutput+6aa BUCKET_ID: ACCESS_VIOLATION_msvcrt!_woutput+6aa Followup: MachineOwner
Looking at the output we can see that this crash is an Access Violation (c0000005) which is the most common type of crash. Under the heading STACK_TEXT we can see the stack for the thread that failed. When looking at a thread, you read the data from the bottom up. So if we look at this thread we can see that Kernel32.dll created the basic thread. NTDLL.DLL called the WUAUENG.DLL module which performed a number of actions before faulting while trying to pass data to MSVCRT.DLL. If you look at the debug output, the dump file would indicate that the problem is the MSVCRT.DLL file – however, this is probably a false positive. The real culprit is more likely to be the WUAUENG module – for which we don’t have symbols. In most cases, when you run into a module for which you don’t have symbols, that module is part of a third-party product. So let’s see if we can get any sort of data about that module and who makes it. To get information about a module, you enter lmvm <module name> in the command box:
Note that you don’t have to enter the file extension (.DLL or .SYS etc). So – what did we find out about WUAUENG.DLL?
0:039> lmvm wuaueng start end module name 50040000 5018c000 wuaueng (export symbols) wuaueng.dll Loaded symbol image file: wuaueng.dll Image path: C:\WINDOWS\system32\wuaueng.dll Image name: wuaueng.dll Timestamp: Fri Oct 27 18:09:38 2006 (454291B2) CheckSum: 0014CE53 ImageSize: 0014C000 File version: 188.8.131.5294 Product version: 184.108.40.20694 File flags: 8 (Mask 3F) >Private File OS: 40004 NT Win32 File type: 2.0 Dll File date: 00000000.00000000 Translations: 0000.04b0 CompanyName: Microsoft Corporation ProductName: Microsoft® Windows® Operating System InternalName: wuaueng.dll OriginalFilename: wuaueng.dll ProductVersion: 220.127.116.1194 FileVersion: 18.104.22.16894 built by: dnsrv(wmbla) FileDescription: Windows Update AutoUpdate Engine LegalCopyright: © Microsoft Corporation. All rights reserved
Ah-ha! It turns out that this is not a third-party file – it’s actually the Windows Update AutoUpdate Engine file from Microsoft. Now you have information about where the file is located, and the Timestamp of the file so you know how old the file is. It just happens to be a file for which public symbols are not available (because it is a security component). Knowing which file is causing the fault is most of the battle – by knowing the binary that is having issues, you can work with the provider of the file to get a fix for their component to resolve the issue. In this case, Microsoft is the provider for the file – so we would need to figure out what is going on with the Windows Update Engine.
In case you were wondering, we generated this dump file for a known issue with the Automatic Update service. The fix for this issue is to follow the action plan outline on the WSUS Product Team Blog and update the Automatic Updates client as well as apply a hotfix for MSI.DLL