Writing Custom Health Rules in SharePoint 2010

I wrote my first health analyzer rule this weekend, and it was a pretty straightforward and painless experience. I thought I’d share a few tips from it.

First, make sure you check out the SDK. We just released the RTM version of it and there is a bunch of great information in there, including lots of details on the creation and deployment of custom health rules.

Second, be patient when you are deploying your health rules. The deployment itself is picked up by the timer job, so after you install it you may have to wait a few minutes before you see your rule in central admin.

From a deployment standpoint, I found the easiest way to do it (as you might expect) was to create a feature and solution. I scoped my feature to the Farm, and I created a feature receiver to add and remove my rule when the feature is installed or uninstalled, as appropriate. This makes the registration code really simple – just two lines of code in my override of FeatureInstalled:

    Assembly asm = Assembly.GetExecutingAssembly();
    SPHealthAnalyzer.RegisterRules(asm);
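For context, here is a minimal sketch of what the whole feature receiver might look like, including the removal side. The class name HealthRuleFeatureReceiver is hypothetical, and pairing FeatureUninstalling with SPHealthAnalyzer.UnregisterRules is my assumption about how the "remove" half would be written. It also assumes the rule classes are compiled into the same assembly as the receiver, since that is what GetExecutingAssembly returns.

```csharp
using System.Reflection;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Administration.Health;

// Hypothetical receiver name; attach it to the Farm-scoped feature in the solution.
public class HealthRuleFeatureReceiver : SPFeatureReceiver
{
    // Registers every health rule found in this assembly.
    public override void FeatureInstalled(SPFeatureReceiverProperties properties)
    {
        Assembly asm = Assembly.GetExecutingAssembly();
        SPHealthAnalyzer.RegisterRules(asm);
    }

    // Assumed counterpart: removes the same rules when the feature is uninstalled.
    public override void FeatureUninstalling(SPFeatureReceiverProperties properties)
    {
        Assembly asm = Assembly.GetExecutingAssembly();
        SPHealthAnalyzer.UnregisterRules(asm);
    }
}
```

Both calls return a dictionary of any rule types that failed to register or unregister. The sketch ignores it, but logging its contents would be a reasonable addition.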

 

Third, be aware of the Category you use for your health rule. Besides the short wait for the rule to actually be registered, a rule whose Category is System doesn’t show up in the default view in central admin, because that view specifically filters out System rules. Conversely, if you have a rule that you DON’T want to show up by default, you can set the Category to System to keep it out of sight. It will still show up when someone looks at an unfiltered view of the rules.

Fourth, it’s worth pointing out where the various health rule properties show up in the UI. Let’s go through each one:

· Summary: this shows up as the Title for your rule, both in the list of installed health rules, as well as when your health rule records an event.

· Explanation and Remedy: when your health rule returns a status of Failed, an entry is added to the Health Reports list in central admin. The Explanation and Remedy fields in that list display whatever strings you return from the corresponding properties in your health rule.

· ErrorLevel: this determines how the entry is “flagged” in the Health Reports list. For my rule I always return Warning, but depending on your circumstances you may flag it as Information, Error, etc.
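To make the mapping concrete, here is a bare-bones rule skeleton showing where each of these properties lives in code. It is only a sketch: the class name SampleHealthRule and every string are placeholders of mine, the Category value is an arbitrary non-System choice, and Check is stubbed out so the class compiles.

```csharp
using Microsoft.SharePoint.Administration.Health;

// Hypothetical rule; the class name and all strings are placeholders.
public class SampleHealthRule : SPHealthAnalysisRule
{
    // Shown as the Title in the rule list and on any Health Reports entry.
    public override string Summary
    {
        get { return "Sample rule: something needs attention."; }
    }

    // Shown in the Explanation field of the Health Reports entry.
    public override string Explanation
    {
        get { return "Describe what the rule found and why it matters."; }
    }

    // Shown in the Remedy field of the Health Reports entry.
    public override string Remedy
    {
        get { return "Describe the steps an administrator should take."; }
    }

    // Controls how the entry is flagged in the Health Reports list.
    public override SPHealthCheckErrorLevel ErrorLevel
    {
        get { return SPHealthCheckErrorLevel.Warning; }
    }

    // Any value other than System keeps the rule in the default view.
    public override SPHealthCategory Category
    {
        get { return SPHealthCategory.Configuration; }
    }

    // Placeholder: a real rule inspects the farm here and returns Failed on a problem.
    public override SPHealthCheckStatus Check()
    {
        return SPHealthCheckStatus.Passed;
    }
}
```

Since Explanation and Remedy are read after Check runs, a rule can build those strings from whatever Check discovered rather than returning constants.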

Fifth, if your rule should run on a schedule through the timer job (which the vast majority of rules will), make sure you override the AutomaticExecutionParameters property. This is where you set the default values for how often your rule should run, whether it should run on all servers or any one server, etc.
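Here is a rough sketch of that override. The values are illustrative assumptions, not the ones from my rule: a daily schedule, run on any one server, no automatic repair. DailyAnyServerRule is a hypothetical abstract base class, used here only so the snippet stands on its own without the other required overrides.

```csharp
using Microsoft.SharePoint.Administration;
using Microsoft.SharePoint.Administration.Health;

// Hypothetical abstract base; a concrete rule would derive from it
// and supply Summary, Explanation, Remedy, ErrorLevel, Category and Check.
public abstract class DailyAnyServerRule : SPHealthAnalysisRule
{
    public override SPHealthAnalysisRuleAutomaticExecutionParameters AutomaticExecutionParameters
    {
        get
        {
            SPHealthAnalysisRuleAutomaticExecutionParameters p =
                new SPHealthAnalysisRuleAutomaticExecutionParameters();
            p.Schedule = SPHealthCheckSchedule.Daily;   // how often the rule runs
            p.Scope = SPHealthCheckScope.Any;           // any one server, not all servers
            p.ServiceType = typeof(SPTimerService);     // service whose servers host the check
            p.RepairAutomatically = false;              // only report, never auto-repair
            return p;
        }
    }
}
```

These are only defaults. An administrator can still change the schedule and scope for the rule from the rule definitions list in central admin.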

Sixth and finally, to debug your rule you want to attach to the OWSTIMER.EXE process. For both installing your rule and executing your rule, all of the code runs in the timer service. That also means that if you change your code and recompile your rule, make sure you restart the SharePoint 2010 Timer service! The service keeps the old assembly loaded until it restarts, so if you don’t and you try to step through your code again, you will find your cursor bouncing around in ways that seem completely illogical.

For my particular rule, I decided to do something based on one of the new limits in SharePoint 2010. We recently updated our limits doc and said that content databases in most workloads should not be bigger than 200GB. So I wrote a rule that enumerates all of the web applications and each content database within each web application. It keeps track of each web app in which it finds a content database bigger than 200GB. If it finds one or more web apps in that situation, my override of the Check method returns SPHealthCheckStatus.Failed. That of course then generates a list item in the Health Reports list. In the Remedy, I suggest that site collections be split out of content databases that are larger than 200GB, and I provide a list of all the web applications that have content databases larger than that size. It was a pretty straightforward rule to write and a good exercise for working through the process.
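The core of that scan can be sketched roughly as follows. This is not the code from the attached solution. The helper class ContentDbSizeScanner is hypothetical, using DiskSizeRequired as the size measure is my assumption, and so is treating 200GB as 200 * 1024^3 bytes. It also only walks the web applications under SPWebService.ContentService, which leaves out the central admin web application.

```csharp
using System.Collections.Generic;
using Microsoft.SharePoint.Administration;

// Hypothetical helper; a rule's Check override would call it.
public static class ContentDbSizeScanner
{
    // Assumed threshold: 200GB expressed in bytes.
    private const ulong MaxBytes = 200UL * 1024 * 1024 * 1024;

    // Returns the display name of each web application that has
    // at least one content database larger than the threshold.
    public static List<string> FindWebAppsWithLargeDatabases()
    {
        List<string> offenders = new List<string>();
        foreach (SPWebApplication webApp in SPWebService.ContentService.WebApplications)
        {
            foreach (SPContentDatabase db in webApp.ContentDatabases)
            {
                // DiskSizeRequired reports the database size in bytes.
                if (db.DiskSizeRequired > MaxBytes)
                {
                    offenders.Add(webApp.DisplayName);
                    break; // one oversized database is enough to flag the web app
                }
            }
        }
        return offenders;
    }
}
```

A Check override built on this would return SPHealthCheckStatus.Failed when the list is non-empty and SPHealthCheckStatus.Passed otherwise, keeping the list around so the Remedy string can name the affected web applications.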

I’ve attached the entire solution, including source code, assembly and wsp so you can use it as a reference if you like, or just take the solution and use the rule in your farm.

HealthContentDB.zip