Here on the Performance Team we constantly deal with issues caused by incorrect performance tuning of various servers. These issues generally manifest as system or process slowness, or as memory and CPU bottlenecks. I have decided to publish a short series of basic guidelines you can use when provisioning a new server or tuning an old one. First, we should address hardware scaling.
Windows Server 2008 R2 only supports 64-bit processors, so obviously that is the first step. This should not be a problem, as 64-bit processors have been widely available for several years; in fact, it is difficult to find a server-class processor nowadays that is not 64-bit. Don't worry, however: most 32-bit applications will run fine on 64-bit hardware, and if they don't, they most likely were not written following proper 32-bit coding guidelines. I personally run 64-bit Windows 7 on my home machines, and I have yet to find a program that I want to use that does not work.
When choosing a processor, get the most modern version available, and the most recent stepping of whichever version you choose. For instance, in our previous post we discussed an issue that is mitigated by using a later stepping of the Intel processor.
When it comes to speed, don't take the numbers at face value; clock speeds from different manufacturers and generations are not an apples-to-apples comparison. Finding out which CPU will really work for you requires some research into how the candidates perform in real-world situations. Scaling up versus scaling out is also something you need to be cognizant of: scaling up to a faster processor may be more advantageous than scaling out to more processors. Some workloads benefit from having more threads running, while others benefit from a smaller number of faster processors. Basically, if you are CPU bound, scaling up will most likely help you more than scaling out. Research has shown that two CPUs will generally not be as fast as a single CPU with twice the clock speed, at least not on a per-application basis.
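As a rough illustration of why two CPUs rarely equal one CPU at twice the clock, Amdahl's law (the standard model here, not something from this post) says that only the parallelizable portion of a workload benefits from extra processors, while a faster clock speeds up everything, serial parts included. The 70% figure below is just an illustrative assumption:

```python
def speedup(parallel_fraction: float, n_cores: int) -> float:
    """Amdahl's law: overall speedup when parallel_fraction of the
    work can be spread across n_cores and the rest stays serial."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)

# Hypothetical workload that is 70% parallelizable:
# two cores at the same clock give well under a 2x speedup,
# while doubling the clock speeds up the whole workload by 2x.
two_cores = speedup(0.70, 2)       # about 1.54x
doubled_clock = 2.0                # serial and parallel parts both benefit
print(f"2 cores: {two_cores:.2f}x, 2x clock: {doubled_clock:.2f}x")
```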
Cache can also make a huge difference in the performance of a given processor. Getting a processor with a large L2 or L3 cache will generally provide better performance than a simple jump in clock speed. What is the difference between a Core 2 Quad processor and a similarly specced Xeon? You guessed it: more cache.
Recommending RAM is a bit of a double-edged sword. You don't want to recommend installing too much RAM, as that wastes money, but having too little is even worse. The problem is, recommending how much RAM to use is really nothing more than an educated guess. As a rule of thumb, the more RAM the better, but I doubt your average CFO is going to greenlight installing 64 GB of RAM in every server.
So, the trick is to install enough RAM that you never fully deplete it, while leaving as little unused as possible. Obviously, a comprehensive performance baseline is out of scope for this post, but a good rule of thumb is to simply monitor Working Set with Perfmon. Working Set is the portion of your virtual memory that has been used 'recently'; in this case 'recently' pretty much means it is still resident in RAM as opposed to having been paged out. If your Working Set starts becoming a sizable percentage of your RAM, you might benefit from more RAM. As long as you don't actually deplete the RAM you are technically okay, but I personally start getting concerned if Working Set spikes go over 80% of RAM size on a regular basis.
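As a minimal sketch of that 80% rule of thumb, here is how you might flag a machine whose Working Set samples (collected via Perfmon, for example) spike above a threshold on a regular basis. The function name, sample format, and spike count are my own illustrative choices, not part of any Windows API:

```python
def frequent_ram_spikes(working_set_samples, total_ram_bytes,
                        threshold=0.80, spike_limit=3):
    """Return True if at least spike_limit samples exceed
    threshold (a fraction) of total physical RAM."""
    spikes = sum(1 for ws in working_set_samples
                 if ws / total_ram_bytes > threshold)
    return spikes >= spike_limit

GB = 1024 ** 3
# Three samples above 80% of a 16 GB machine: time to consider more RAM.
print(frequent_ram_spikes([13 * GB, 14 * GB, 13.5 * GB], 16 * GB))
```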
Application recommendations are of course going to trump any ad hoc testing you may do. If a vendor says you need X amount of RAM, it is best to install at least that much just to be on the safe side.
The pagefile is the other piece of virtual memory that we need to be concerned about. The pagefile is really just a file on the hard disk that is set up so that it operates like RAM. The problem is, RAM is fast and hard disks are slow, so having to read from or write to the hard disk to satisfy a memory request can be very time consuming.
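To put rough numbers on that gap (ballpark figures I am assuming for illustration, not measurements from this post): a DRAM access is on the order of 100 nanoseconds, while a random read on a spinning disk is on the order of 10 milliseconds, so a hard page fault can cost around five orders of magnitude more than a RAM access:

```python
# Assumed order-of-magnitude latencies, for illustration only.
DRAM_ACCESS_NS = 100           # ~100 nanoseconds for a memory access
DISK_SEEK_NS = 10_000_000      # ~10 milliseconds for a random disk read

slowdown = DISK_SEEK_NS / DRAM_ACCESS_NS
print(f"A hard page fault is roughly {slowdown:,.0f}x slower than RAM")
```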
To speed access to the paging file, it is recommended to place it on a separate physical disk from the operating system. Better yet, create multiple paging files on different disks, or even on multi-disk arrays for real speed. You may have read here before that we need a paging file on the system disk in order to catch things like memory dumps, and that is true. However, it is not normally necessary to configure every machine to capture full-size memory dumps; that is only recommended when you are actively troubleshooting. Usually, you can keep the system drive set up with a small 1 or 2 GB pagefile and still be able to catch even a full kernel dump if needed.
The total size of the pagefiles you might need is another one of those things that you will get many different opinions on, so I am not going to offer one. To make this determination on your own, you can set your pagefiles to System Managed and run the machine under a normal load for a few days. Again, use Perfmon to monitor the system and keep an eye on Paging File – %Usage. If your percentage of pagefile usage gets too high, or especially if your pagefile expands, then you most likely need to set the total size to be larger.
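Here is a minimal sketch of that check on collected %Usage samples. The 70% usage limit is an arbitrary placeholder (the post deliberately declines to name a number), and the function and parameter names are illustrative, not any real tool's API:

```python
def pagefile_needs_resize(usage_samples_pct, initial_mb, current_mb,
                          usage_limit_pct=70.0):
    """Flag a pagefile for resizing if Perfmon's Paging File %Usage
    peaked above usage_limit_pct, or if the system-managed pagefile
    expanded beyond its initial size (current_mb > initial_mb)."""
    expanded = current_mb > initial_mb
    peak_usage = max(usage_samples_pct)
    return expanded or peak_usage > usage_limit_pct

# Peak usage of 40% and no expansion: probably sized fine.
print(pagefile_needs_resize([10.0, 25.0, 40.0], 4096, 4096))
```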
NOTE: The recommendations above pertaining to RAM and pagefile are just simple guidelines and may need to be tweaked based on various factors including page faults/second, disk idle and cache bytes.
That is all for now, next time we will discuss physical disks, the disk subsystem and power management.
Until next time,