Put a BlackBox (Black Box) on your server!

So something I've been recommending to my customers for a while is to have the equivalent of an in-flight data recorder on their server. You can do this with Perfmon and circular logging, and it isn't that hard to set up.

Why? Well, take this scenario for example. You just got a call from one of your users who said the server was incredibly slow - you log on and everything looks fine. The user says yeah, it's OK now, but what happened?

Well, if it happens a couple more times - especially if someone or some automated process is waking you up in the middle of the night - you're probably going to want to get to the bottom of this, right? So why do you have to wait until the problem happens again? Because you don't have any data from the last time. Well, now you can have it.

What you want to do is set Perfmon up so it ALWAYS runs. Keep a log of say 300 MB, 500 MB, or maybe a gigabyte of history. Set it up to start every time the machine starts up. And set it up to overwrite the log. This will always keep a history (similar to your event log) of what was just going on with the server in question.

Here's how:

So the first thing you want to do is to have some counters, right? Well, which ones should you pick? Here is a template you can use. This is preloaded with all my personal favorites. If you know me very well then you know I teach a class from time to time called Vital Signs, which is all about learning Performance Monitor and what the various counters mean. The text file linked above has all the counters you'd need to solve 95%+ of the perf issues in the world.
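If you just want to see what a counters file looks like: it's plain text with one counter path per line. Here's a small sketch of the format. To be clear, this is not the full template, just a handful of common counters picked for illustration:

```text
\Processor(_Total)\% Processor Time
\System\Processor Queue Length
\Memory\Available MBytes
\Memory\Pages/sec
\PhysicalDisk(*)\Avg. Disk sec/Read
\PhysicalDisk(*)\Avg. Disk sec/Write
\Network Interface(*)\Bytes Total/sec
```

The (*) means "every instance", so you get one set of values per disk or per network card.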

So what do you do with it? First create a subfolder on C:\ (or whatever drive you want) and call it perflogs - if it isn't already there. Then put the counters.txt file from above into that folder. Then all you need to do is type:

logman create counter BlackBox -cf c:\perflogs\counters.txt -si 05:00 -f bincirc -o c:\Perflogs\Blackbox.blg -a --v -max 500

What this will do is create a BLG (binary log) file in the perflogs subfolder. It will take a snapshot of all the counters in that counters.txt file every five minutes. It will run until it hits 500 MB, and then, because the format is bincirc (binary circular), it will start overwriting the oldest data in the file. So it will never grow beyond that size.
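If you want to double-check what you just created, logman can print the settings back to you. A quick sketch, using the BlackBox name from the create command above:

```bat
rem Show the properties of the BlackBox counter log
logman query BlackBox
```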

Then, all you need to do is start the log. You can type:

logman start BlackBox

Now, here is the trick - this will keep that counter running until the machine reboots. So if you want it to keep running, put it into a startup script.
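One way to do that, as a sketch: register a scheduled task that runs at boot. The task name "Start BlackBox" is just something made up for this example, and this assumes you run it from an elevated command prompt:

```bat
rem Create a task that starts the BlackBox log at every boot, running as SYSTEM
schtasks /create /tn "Start BlackBox" /tr "logman start BlackBox" /sc onstart /ru SYSTEM
```

A plain startup script containing the same logman line would do the job too; the scheduled task just keeps it all in one place.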

Then you'll be able to look back into the log (to see what happened after someone calls and complains) by stopping the log either from inside of perfmon or from the command line by typing:

logman stop BlackBox

Then copy the blackbox.blg file to your computer, start the blackbox back up again - and troubleshoot as normal.
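If you'd rather look at the numbers in a spreadsheet than in Perfmon, relog (it ships with Windows, same as logman) can convert the copied log. A minimal sketch, assuming you copied the file to c:\temp, which is a folder name picked purely for illustration:

```bat
rem List the counters that were captured in the log
relog c:\temp\Blackbox.blg -q

rem Convert the binary log to CSV so it opens in Excel
relog c:\temp\Blackbox.blg -f csv -o c:\temp\Blackbox.csv
```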

(If you're looking for advice on how to interpret Perfmon counters, stand by for a quick-tips post from me coming up later this month - or better yet, ask your Microsoft TAM about getting you into a Vital Signs class.)

You can find a copy of this at my personal blog as well at https://www.9z.com/2011/05/put-a-blackbox-black-box-on-your-server.html