It seems that everywhere we look today we have a Process for dealing with something. From fire drills to hurricanes, training for a marathon or handling grief. Everything is a Process. Computers are no different... Except they are.
When troubleshooting something that is broken on a computer it comes down to determining which process is at fault. It's always a process that ends up being the bad guy. But what is a process anyway?
A process is simply a container. The gym bag that carries all the running equipment. The bucket at the beach that carries the shovel, rake and sand. Nothing more. Nothing less. More specifically, this container holds the set of resources used by the threads that execute an instance of a program… our shovels and rakes.
Each process includes:
Virtual address space (tracked by Virtual Address Descriptors, or VADs)
List of open Handles (handle table)
Thread (at least one)
Each process is also unique and is assigned its own Process ID (PID). When troubleshooting bad behavior on a system we can often trace the source via this PID.
Why the PID? The Process ID shows up in several different locations and reports from a system. This is very helpful when we have multiple processes running on the system with the same process name. A common example of this is svchost.exe. If you open a command prompt and type just tasklist, you will see several instances of svchost.exe running on a system:
We can also see this from Taskman:
When gathering data on an issue we often ask for an MPS report or MSDT report, possibly along with a Performance Monitor log, Xperf trace or memory dump. It may seem like a lot of data to gather, and for each issue we may only look at a few of the files in an MSDT, but many times the key link in all this information is the PID. This is especially true when dealing with issues like svchost.exe, where there are multiple instances of an executable. The *.nfo files, tlist.txt and pstat.exe output all list the related PID.
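As a rough illustration of why the PID is the useful key here, the sketch below groups PIDs by image name from the CSV form of the tasklist output (tasklist /fo csv /nh). The helper names and the sample rows are made up for the example; only the column order (image name first, PID second) is taken from what tasklist prints.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the shape printed by: tasklist /fo csv /nh
SAMPLE_TASKLIST = '''"System","4","Services","0","236 K"
"svchost.exe","1016","Services","0","18,452 K"
"svchost.exe","1100","Services","0","9,120 K"
"explorer.exe","2044","Console","1","31,008 K"
'''

def pids_by_image(tasklist_csv):
    """Map each image name to the list of PIDs running under that name."""
    groups = defaultdict(list)
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Skip blank or malformed lines instead of failing on them.
        if len(row) < 2 or not row[1].strip().isdigit():
            continue
        groups[row[0].lower()].append(int(row[1]))
    return dict(groups)

def shared_names(tasklist_csv):
    """Keep only the image names that more than one process is using."""
    return {name: pids
            for name, pids in pids_by_image(tasklist_csv).items()
            if len(pids) > 1}

print(shared_names(SAMPLE_TASKLIST))
```

On a live system the input string would come from running tasklist itself; any name reported by shared_names is one where the process name alone cannot tell you which instance you are looking at.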
Let's say we have a leak in svchost. We can see this from perfmon:
In Perfmon that is svchost#2. Initially I was looking at Non-Paged Pool behavior to see if any process stood out. Since svchost#2 decided to play that role, I wanted to find out which process ID that svchost instance was running under. We just need to add the ID Process counter (under the Process object) for svchost#2. It shows up as a flat line at a specific value. That's our PID:
From there we can dig into PID 1016 with Tasklist. Originally just typing Tasklist in at the command prompt gave a generic rundown. But if you do a Tasklist /svc you will get a breakdown of the services running under each PID:
This, at times, can point to an application that we can isolate or update quickly.
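If you do this lookup often, it can be scripted. The sketch below assumes the CSV form of the same command (tasklist /svc /fo csv /nh), where each row holds the image name, the PID and a comma-separated list of services. The function name and the sample rows are invented for illustration, and the services shown are not meant to reflect what really runs under PID 1016 on any given system.

```python
import csv
import io

# Made-up rows in the shape printed by: tasklist /svc /fo csv /nh
SAMPLE_SVC = '''"System","4","N/A"
"svchost.exe","1016","Dhcp,EventLog,lmhosts"
"svchost.exe","1100","Dnscache"
'''

def services_for_pid(svc_csv, pid):
    """Return the service names hosted by the given PID, or an empty list."""
    for row in csv.reader(io.StringIO(svc_csv)):
        if len(row) < 3 or row[1].strip() != str(pid):
            continue
        names = [name.strip() for name in row[2].split(",")]
        # tasklist prints N/A when a process hosts no services.
        return [name for name in names if name and name != "N/A"]
    return []

print(services_for_pid(SAMPLE_SVC, 1016))
```

The returned list is the set of suspects for that svchost instance, which is usually a much shorter list than everything running on the box.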
If we need to dig into it further, we could get a memory dump of the system while it is in the failing state. Here again we can work off the process ID, which the debugger lists in the Cid field:
kd> !process 0 0
**** NT ACTIVE PROCESS DUMP ****
PROCESS 80a02a60 Cid: 1016 Peb: 00000000 ParentCid: 0000
DirBase: 00006e05 ObjectTable: 80a03788 TableSize: 150.
One very important caveat to using the process ID to hunt down the source of a misbehaving application is that the PID will probably change when you reboot (only core system processes get the same PID on reboot) or if you restart the process or parent process. It is critical when tracking down memory leaks and hanging applications that the data we gather is all from the same boot instance. Data gathered across multiple boots may invalidate our hunt for the bad process.
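One way to guard against mixing boots is to note the boot time alongside every piece of data you collect and check it before comparing PIDs. The sketch below is only an illustration of that bookkeeping: the Capture record, its field names and the sample values are all hypothetical, and how you record the boot time is up to you.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Capture:
    """One piece of collected data (hypothetical record for this example)."""
    source: str     # where the data came from, e.g. "perfmon" or "tasklist"
    boot_time: str  # boot time noted when the data was gathered
    pid: int        # PID of the process under investigation

def group_by_boot(captures):
    """Bucket captures by the boot instance they were gathered in."""
    buckets = defaultdict(list)
    for capture in captures:
        buckets[capture.boot_time].append(capture)
    return dict(buckets)

def same_boot(captures):
    """True only when every capture comes from a single boot instance."""
    return len(group_by_boot(captures)) <= 1

# Made-up example: the memory dump was taken after a reboot.
captures = [
    Capture("perfmon", "2024-01-10 08:02", 1016),
    Capture("tasklist", "2024-01-10 08:02", 1016),
    Capture("memory dump", "2024-01-11 07:45", 1016),
]
print(same_boot(captures))
```

When same_boot returns False, a matching PID in two captures proves nothing, because the number may belong to a different process after the reboot.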
Identifying the process we're investigating is one of the first things we try to do...it can also be the most difficult depending on the data we have at hand. Once the blame is firmly set on a specific process … well that's when the fun begins.