I'm not going to start a new blog by insulting my favorite Microsoft solution, Small Business Server, and saying that supporting it is easy or tame... that's just not accurate. After several years in that realm I can safely say I learned to stay afloat in almost any situation. But after moving to our Performance team I've quickly come to realize there is more to swimming than just a good front crawl. And now that I'm here it's time to dive in and really learn to swim!
Any new team member should, at one point or another, pose the question "What exactly do we troubleshoot?" and that really is the heart of the Performance team… defining just that. For example, a very common statement we hear is "My server is slow." Now, let's just get this out of the way right now: "Slow" is a four-letter word. On its own, "Slow" as a description is useless. It is a good ice-breaker and it gets the conversation started, but it is just the beginning of the fun.
What is slow?
What is the expected behavior?
When did this start?
Does it get slow over time?
Does it eventually make the application or server stop responding?
If we reboot does it get better for a while?
How often has this happened?
How many times has it happened?
How long does it take to return?
Can we reproduce the behavior on demand?
Is it happening right now?
The questions are seemingly endless depending on the logic path we follow. But once they are answered, we should have a pretty complete view of the issue at hand. To sync back to the swimming analogy… I'd much prefer to learn to swim in a chlorinated neighborhood pool versus a limitless ocean with unknown variables called sharks, jellyfish and seaweed!
While these questions can be frustrating and tedious to ask, and can sound completely repetitive, each twist and turn can point us in a different direction. What is slow? "Trying to open MMCs, calc.exe and paint.exe" is different from "Trying to close Office docs." How slow? "Takes 10 seconds" is not the same as "Takes 3 minutes" or "45 minutes." When did this start? "Just now", "3 days ago", "A month ago", or "Six months ago." Does it get better on its own or do we have to reboot? If it gets better on its own, it's not a leak or a resource lock. How long does it take to return? "Happens once a month" requires completely different thinking than "happens every 2 hours" or "happens every day at 5pm."
When it comes to "Slow" server performance, the causes can range from massive network traffic and intensive LDAP queries, to physical bottlenecks such as older NICs, slow disks, underperforming drive controllers and CPUs, to low memory conditions caused by anything from insufficient RAM to low disk space for the page file or virtual memory leaks. It can be software doing its job with slightly unanticipated enthusiasm, two processes arm wrestling over bragging rights, or that one greedy little kid who takes all the candy out of the candy bowl and stuffs it in his pocket for later. Exchange, SQL, DNS... I'm looking at you! (To their credit, those greedy kids do give back the candy they've hoarded if asked.)
So here I am, learning to swim. I'm not afraid of the water, but I'm not jumping into the ocean either. We're going to make sure these lessons take place in a nice community pool where the posted signs never just say "Slow."