Clustering Best Practices: Checking Hotfix Compliance

Do you use Microsoft Clustering? Are your clusters configured according to the current Microsoft recommended practices? How can you be sure of the configuration?

Questions, questions. The first is easy, the second you might be unsure of, and the third is somewhat more difficult. If you search Microsoft.com you will find recommendations on how best to configure your cluster, but there is no easy way to double-check that everything is as it should be, short of a tedious manual review of all the settings.

While clustering isn't overly complicated, this checking process can be time-consuming, and, as with every manual process, is subject to human error.

So, what to do? In this article, I will be developing a script that takes care of part of the checking, namely auditing the status of the updates on each of the nodes.

UPDATE: My colleague, Ben Pearce, has created a PowerShell version of this script.  Check it out on his blog.

What Will the Script Check?
Well, to ensure your cluster is as stable as possible, there are two things to check in relation to updates:

  • Recommended Updates. Are the Microsoft recommended updates (see below) installed on all nodes? These updates address very common cluster issues, so installing them will help you avoid some known problems.
  • Update Parity. Does each node have the same set of updates installed, recommended or not? As clustered applications could potentially run on any node in the cluster (depending on the configuration), the ideal situation is that the nodes are identical in all aspects of hardware and software, including updates.

The script will check both of these aspects of patching.

The Recommended Updates for Microsoft Clusters
As mentioned above, Microsoft publish a number of lists of updates that are recommended for installing on Cluster servers. The following articles list the recommended updates:

895092 - Recommended hotfixes for Windows Server 2003-based server clusters
923830 - Recommended hotfixes for Windows Server 2003 Service Pack 1-based server clusters
935640 - Recommended hotfixes for Windows Server 2003 Service Pack 2-based server clusters
895090 - Recommended hotfixes for Windows 2000 Service Pack 4-based server clusters

Obviously, you are not obliged to install these updates, and in general Microsoft do not recommend installing updates arbitrarily. However, the updates in these articles are recommended because they are known to have resolved many issues raised with Microsoft Support Services. So it isn't enough for our script to just list the updates that are installed; it also needs to check them against the recommended updates.

Script Overview
The general steps that the script will need to implement are:

  • Get a list of nodes in the cluster
  • Connect to the nodes and enumerate the installed updates
  • Check the installed updates against the list of recommended ones
  • Flag any missing updates, or disparity between nodes

Using VBScript together with a couple of WMI calls should be sufficient to take care of this.

The Script
Finally, down to business. Here's a walkthrough of the key points in the finished script:

To keep things simple, we will expect some user input from the command-line. The user needs to supply the cluster name and the file containing the list of recommended updates, which we check for before doing anything else:

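' Read the two named arguments: /cluster:<name> and /KBlist:<file>
strClusterName = WScript.Arguments.Named.Item("cluster")
strKBList = WScript.Arguments.Named.Item("KBlist")

' Both arguments are required, so stop if either one is missing
If (("" = strClusterName) Or ("" = strKBList)) Then
    WScript.Echo "Error parsing command-line. Check Command-line and try again."
    WScript.Quit
End If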

From this you can see that for the script to work properly, the user must call it like this:

cscript ClusterHotfixCheck.vbs /cluster:<cluster_name> /KBlist:<KB_list_text_file>
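If you want friendlier feedback than a single generic error, one option is to report which argument is missing and to confirm that the list file exists before connecting to anything. The following is only a sketch: CheckArguments is a hypothetical helper that is not part of the downloadable script, and the message texts are my own.

```vbscript
' Hypothetical helper (not in the original script): returns an empty
' string when the arguments look usable, otherwise an error message.
Function CheckArguments(strCluster, strListFile)
    Dim objFSO
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    If "" = strCluster Then
        CheckArguments = "Missing /cluster:<cluster_name>"
    ElseIf "" = strListFile Then
        CheckArguments = "Missing /KBlist:<KB_list_text_file>"
    ElseIf Not objFSO.FileExists(strListFile) Then
        CheckArguments = "KB list file not found: " & strListFile
    Else
        CheckArguments = ""
    End If
End Function

' Validate the command line and print a usage hint on failure
strError = CheckArguments(WScript.Arguments.Named.Item("cluster"), _
                          WScript.Arguments.Named.Item("KBlist"))
If "" <> strError Then
    WScript.Echo strError
    WScript.Echo "Usage: cscript ClusterHotfixCheck.vbs /cluster:<cluster_name> /KBlist:<KB_list_text_file>"
    WScript.Quit 1
End If
```

Checking for the file up front means a mistyped path is reported immediately, rather than after every node has been queried.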

To know which nodes to check we need to connect to the cluster and ask for a list:

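' Connect to the cluster's WMI namespace using the caller's credentials
Set objWMICluster = GetObject("winmgmts:{impersonationLevel=impersonate}!\\" & strClusterName & "\root\mscluster")

' Ask the cluster WMI provider for the name of every node
Set collNodeList = objWMICluster.ExecQuery("Select name from MSCluster_Node")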

Here we are using the cluster WMI provider. In this case we are connecting using the user's credentials. WMI allows you to supply alternate credentials for remote connections, but for this example, we take the simple approach.
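For completeness, here is a rough sketch of what the alternate-credentials route could look like, using the SWbemLocator object from the WMI scripting library. ConnectWithCredentials is a hypothetical helper of mine, not something in the finished script, and the user name and password in the usage comment are placeholders.

```vbscript
' Sketch only: connect to a remote WMI namespace with explicit credentials.
' ConnectWithCredentials is a hypothetical helper, not part of the script.
Const wbemImpersonationLevelImpersonate = 3

Function ConnectWithCredentials(strComputer, strNamespace, strUser, strPassword)
    Dim objLocator, objService
    Set objLocator = CreateObject("WbemScripting.SWbemLocator")
    ' ConnectServer takes the server, namespace, user and password in that order
    Set objService = objLocator.ConnectServer(strComputer, strNamespace, strUser, strPassword)
    ' Same impersonation level as the moniker-based connection above
    objService.Security_.ImpersonationLevel = wbemImpersonationLevelImpersonate
    Set ConnectWithCredentials = objService
End Function

' Example call (placeholder account details):
' Set objWMICluster = ConnectWithCredentials(strClusterName, "root\mscluster", "DOMAIN\user", "password")
```

Bear in mind that WMI rejects explicit credentials for connections to the local machine, so this approach only applies when the script runs on a computer that is not the target.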

Next we interrogate each of the nodes:

' Key: update ID. Value: array of flags (0 = recommended flag, 1 to 8 = nodes, 9 = node name length)
Set dictFixes = CreateObject("Scripting.Dictionary")
For Each objNode In collNodeList
    intNodeID = intNodeID + 1
    Set objWMINode = GetObject("winmgmts:{impersonationLevel=impersonate}!\\" & objNode.Name & "\root\cimv2")
    Set collFixes = objWMINode.ExecQuery("SELECT HotfixID FROM Win32_QuickFixEngineering")
    For Each objFix In collFixes
        strTempFix = StripPrefix(objFix.HotfixID)
        If (dictFixes.Exists(strTempFix)) Then
            ' Already seen on another node: flag this node in the stored array
            arrTemp = dictFixes(strTempFix)
            arrTemp(intNodeID) = "1"
            dictFixes.Remove(strTempFix)
            dictFixes.Add strTempFix, arrTemp
        Else
            ' First sighting: start from a fresh array so that flags set
            ' for earlier nodes do not carry over to this update
            arrNodes = Array(0,0,0,0,0,0,0,0,0,0)
            arrNodes(9) = Len(objNode.Name)
            arrNodes(intNodeID) = "1"
            dictFixes.Add strTempFix, arrNodes
        End If
    Next
Next

A dictionary object is used to store details of all of the updates installed on all of the nodes. A normal array would probably do, but the dictionary object has a simple method (Exists) for checking if an update is already in the list. I'm not sure how this is implemented internally (maybe a hash table), but it is almost certainly faster than iterating through the elements of an array in the script to look for a given update. For each update, we flag it as installed on a particular node by putting a "1" in that node's slot of a simple ten-element array. Element 0 is reserved for the recommended flag that is set later, and element 9 holds the length of the node name.
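To see the dictionary-plus-array idea in isolation, here is a small stand-alone sketch that needs no cluster. The update IDs and node numbers are made up for illustration, and FlagInstalled is a hypothetical helper rather than code from the finished script. It relies on the fact that VBScript hands back a copy when you read an array out of a dictionary, so the modified copy has to be written back.

```vbscript
' Stand-alone illustration with made-up data (no WMI required).
Dim dictFixes, arrRow, strKey
Set dictFixes = CreateObject("Scripting.Dictionary")

' Hypothetical helper: mark strFix as installed on node intNodeID
Sub FlagInstalled(dict, strFix, intNodeID)
    Dim arrTemp
    If dict.Exists(strFix) Then
        arrTemp = dict(strFix)                  ' this is a copy of the stored array
    Else
        arrTemp = Array(0,0,0,0,0,0,0,0,0,0)    ' fresh row of flags
    End If
    arrTemp(intNodeID) = "1"
    dict(strFix) = arrTemp                      ' add or overwrite in one step
End Sub

FlagInstalled dictFixes, "111111", 1    ' update found on node 1
FlagInstalled dictFixes, "111111", 2    ' same update found on node 2
FlagInstalled dictFixes, "222222", 2    ' update found on node 2 only

' Print one line per update showing the flags for the two nodes
For Each strKey In dictFixes.Keys
    arrRow = dictFixes(strKey)
    WScript.Echo strKey & ": node1=" & arrRow(1) & " node2=" & arrRow(2)
Next
```

Assigning through the default Item property (dict(strFix) = arrTemp) either adds the key or overwrites it, which is one way to avoid a separate Remove followed by Add.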

Again, WMI is used to gather the data we need. This time the Win32_QuickFixEngineering class gives us the list of installed updates.

The StripPrefix() function is defined later in the script and is used to remove any "Q" or "KB" prefix commonly found in the ID for an update:

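Function StripPrefix(strHotfixID)
    If (1 = InStr(strHotfixID, "Q")) Then
        ' "Q" prefix: keep the six characters that follow it
        StripPrefix = Mid(strHotfixID, 2, 6)
    ElseIf (1 = InStr(strHotfixID, "KB")) Then
        ' "KB" prefix: keep the six characters that follow it
        StripPrefix = Mid(strHotfixID, 3, 6)
    Else
        ' No recognised prefix: return the ID unchanged
        StripPrefix = strHotfixID
    End If
End Function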
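Note that InStr does a case-sensitive comparison by default, and that the function keeps exactly six characters after the prefix. If your nodes report IDs in lower case or with longer numbers, a regular expression is one way to loosen those assumptions. The variant below is a sketch only: StripPrefixAnyCase is a hypothetical name, and the IDs passed to it are invented for illustration.

```vbscript
' Hypothetical variant of StripPrefix: ignores case and accepts any number
' of digits after a "Q" or "KB" prefix. Not part of the finished script.
Function StripPrefixAnyCase(strHotfixID)
    Dim objRegEx, colMatches
    Set objRegEx = New RegExp
    objRegEx.Pattern = "^(KB|Q)(\d+)"   ' prefix, then the run of digits
    objRegEx.IgnoreCase = True
    Set colMatches = objRegEx.Execute(Trim(strHotfixID))
    If colMatches.Count > 0 Then
        ' SubMatches(1) is the second group, i.e. the digits only
        StripPrefixAnyCase = colMatches(0).SubMatches(1)
    Else
        ' No recognised prefix: return the ID unchanged
        StripPrefixAnyCase = strHotfixID
    End If
End Function

WScript.Echo StripPrefixAnyCase("KB123456")       ' 123456
WScript.Echo StripPrefixAnyCase("q123456")        ' 123456
WScript.Echo StripPrefixAnyCase("KB1234567-v2")   ' 1234567
WScript.Echo StripPrefixAnyCase("File 1")         ' File 1
```

Whichever version you use, the IDs in the recommended list file need to be written in the same stripped form, because they are compared as plain strings.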

Now that we have a list of what is actually installed, we need to load the list of recommended updates and check whether each one is in our existing list. Here's where we again take advantage of the ease of searching the dictionary object to check if a specific update is installed somewhere in the cluster:

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFixList = objFSO.OpenTextFile(strKBList, 1)   ' 1 = open for reading
arrNodes = Array(0,0,0,0,0,0,0,0,0,0)
Do While objFixList.AtEndOfStream <> True
    ' Trim the line so stray spaces do not prevent a match
    strHotfix = Trim(objFixList.ReadLine)
    If ("" <> strHotfix) Then
        If (dictFixes.Exists(strHotfix)) Then
            ' Installed somewhere in the cluster: mark it as recommended
            arrTemp = dictFixes(strHotfix)
            arrTemp(0) = "2"
            dictFixes.Remove(strHotfix)
            dictFixes.Add strHotfix, arrTemp
        Else
            ' Recommended but not installed on any node
            arrNodes(0) = "2"
            dictFixes.Add strHotfix, arrNodes
        End If
    End If
Loop
objFixList.Close
DisplayOutput()

From the code, you can see that it simply reads a line from the supplied file and uses this as the name of an update to check. So, for the script to work properly, the input file needs to be a simple text file with one KB ID on each line.

This time, we use "2" to indicate that this update is on the recommended list. The DisplayOutput() function is defined later in the script (but not discussed in this post as it's a little messy) and is used to output the results to a text file on the user's desktop. Typical output looks like this:


[Image: typical script output]

Patching Status
Determine the status of your patching from the output as follows:

Recommended Updates
"--" is used if an update is on the recommended list and is installed on all nodes.
"XX" is used if an update is recommended but is missing from one or more nodes.

General Updates
"++" is used if an update is not on the recommended list and is installed on all nodes.
"**" is used if an update is not on the recommended list and is missing from one or more nodes.

Specific Nodes and Updates
"0" is used to show that an update is not installed on the given node.
"1" is used to show that the update is installed on the given node.
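Since DisplayOutput() isn't shown in this post, here is a sketch of how the two-character status could be derived from one update's array of flags. It is my own reconstruction from the legend above, not the code in the downloadable script. StatusSymbol and intNodeCount are hypothetical names, and the sample rows are invented.

```vbscript
' Hypothetical reconstruction of the status logic described in the legend.
' arrRow(0) = "2" means recommended; arrRow(1..intNodeCount) = "1" means installed.
Function StatusSymbol(arrRow, intNodeCount)
    Dim i, blnRecommended, blnOnAllNodes
    blnRecommended = (CStr(arrRow(0)) = "2")
    blnOnAllNodes = True
    For i = 1 To intNodeCount
        If CStr(arrRow(i)) <> "1" Then blnOnAllNodes = False
    Next

    If blnRecommended And blnOnAllNodes Then
        StatusSymbol = "--"     ' recommended, installed everywhere
    ElseIf blnRecommended Then
        StatusSymbol = "XX"     ' recommended, missing somewhere
    ElseIf blnOnAllNodes Then
        StatusSymbol = "++"     ' not recommended, installed everywhere
    Else
        StatusSymbol = "**"     ' not recommended, missing somewhere
    End If
End Function

' Invented rows for a two-node cluster
WScript.Echo StatusSymbol(Array("2", "1", "1"), 2)   ' --
WScript.Echo StatusSymbol(Array("2", "1", 0), 2)     ' XX
WScript.Echo StatusSymbol(Array(0, "1", "1"), 2)     ' ++
WScript.Echo StatusSymbol(Array(0, 0, "1"), 2)       ' **
```

The "XX" rows are the ones that need attention first, followed by the "**" rows, which indicate that the nodes have drifted apart.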

Known Limitations
The script as it stands has some limitations. Two key ones are:

  • No account is taken of the existing Service Pack level or any roll-up packages, which may make some updates redundant.
  • No account is taken of whether a particular update is actually needed on the cluster. Some of the recommended updates only apply to certain resource types and if you don't use that type, then the update is not needed.

It is left to the reader to implement these improvements if required - I thought I'd leave some fun stuff for you to do!
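As a possible starting point for the first limitation, the Service Pack level of a node can be read from the Win32_OperatingSystem WMI class. The sketch below is an assumption about how you might begin, not part of the finished script. GetServicePackLevel is a hypothetical helper, and "." simply means the local computer.

```vbscript
' Hypothetical helper: returns the Service Pack major version of a computer,
' or -1 if the query returns no operating system instance.
Function GetServicePackLevel(strComputer)
    Dim objWMI, collOS, objOS
    GetServicePackLevel = -1
    Set objWMI = GetObject("winmgmts:{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
    Set collOS = objWMI.ExecQuery("SELECT ServicePackMajorVersion FROM Win32_OperatingSystem")
    For Each objOS In collOS
        GetServicePackLevel = objOS.ServicePackMajorVersion
    Next
End Function

' Example: report the level of the machine running the script
WScript.Echo "Service Pack level: " & GetServicePackLevel(".")
```

Called once per node inside the node loop, this value could be used to choose the matching list of recommended updates, or to warn when the nodes are not all at the same Service Pack level.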

Download
You can download the full script together with lists of recommended updates from the link below:

ClusterHotfixCheck.zip

[This script is subject to the Microsoft Terms of Use. Essentially, you use it at your own risk!]

I'd be interested in hearing your experience of using the script, so let me know if you find it useful or have any problems with it.