SharePoint 2007/WCF Services Bug: “Object reference not set to an instance of an object.”

I just finished working through an issue with one of my customers. My customer had written a WCF service to kick off some custom actions on remote SharePoint servers, but they kept getting the old, “Object reference not set to an instance of an object” error. Personally, I love this one, because it’s so easy to debug: all you have to do is figure out which object was null. And why.

Ok, that was sarcasm. And not very well done, at that. Actually, I hate this error because it’s so vague, and because at least half the time I’ve seen it, the faulting method doesn’t actually contain the code that caused the root problem. It just happens to be where everything finally blew up. And in fact, that was the case here.

So that you don’t have to keep thinking to yourself, “Would you please just get to the point!?”, I’m going to give the solution, then the cause, and then I might go a little into the process of how we got there.

The Solution

There are a few possibilities (as always):

  1. Try turning off Verbose logging. This one doesn’t always seem to work, but in certain instances it does. Apparently, when Verbose logging is turned on, SharePoint calls one of the utility methods in which we found the bug related to StackFrame and StackTrace.
    Now, to be clear: having Verbose logging turned on will not always cause this error. So, you may in fact find that in your code, you don’t have this problem at all. And hey, lucky you! In our brief testing, having Verbose logging turned on for some categories worked fine, while others resulted in the appearance of our friend the “Object reference” error. As the saying goes, YMMV.
  2. Don’t do it. Find an alternative. I hate that this is the answer, but we actually tried option number one and initially got it to work. With further testing, however, we hit the same error again, but this time caused by a separate utility method that did basically the same thing with StackTrace and StackFrame as the first method. And that was with Verbose logging turned off.

That’s it, really. You may get it to work, and please, please, please: if you did encounter this error and still found a way to make the WCF call work (well, the instantiation, actually; we had no trouble getting the WCF call to go through), let me know how you did it.

Here are two alternatives:

  1. Use the SharePoint 2007 web service API, which is pretty robust.
  2. Use a standard web service hosted inside a SharePoint web app.

In this customer’s case, they wanted to do some special configuration magic that would have been much more difficult without the special configuration file they’d designed. So, they opted for creating a web service hosted inside one of the SharePoint web apps. (I’m not going to give too many specifics about their architecture, but the point is that this option worked.) As I said, however, the SharePoint 2007 web service API is fairly robust. With some planning, you will probably be able to accomplish everything you want just by making use of it. A quick search on Bing will locate some really helpful resources for you.
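To give a feel for the first alternative, here is a minimal sketch of a client talking to the built-in Lists web service (Lists.asmx, under _vti_bin) over plain SOAP. It is written in Python only because any SOAP-capable client will do; it is not the customer’s code. The site URL and list name are made up for illustration, the helper name build_get_list_items_request is hypothetical, and authentication (usually NTLM on a SharePoint farm) is deliberately left out. The sketch only builds the request; it does not send it.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SP_NS = "http://schemas.microsoft.com/sharepoint/soap/"


def build_get_list_items_request(site_url, list_name):
    """Build (but do not send) a SOAP 1.1 request for Lists.asmx GetListItems.

    site_url and list_name are illustrative inputs; authentication is omitted.
    """
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{SP_NS}}}GetListItems")
    ET.SubElement(operation, f"{{{SP_NS}}}listName").text = list_name

    payload = ET.tostring(envelope, encoding="utf-8", xml_declaration=True)
    return urllib.request.Request(
        url=site_url.rstrip("/") + "/_vti_bin/Lists.asmx",
        data=payload,
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            # SOAP 1.1 expects the action URI as a quoted string.
            "SOAPAction": '"' + SP_NS + 'GetListItems"',
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical site and list; prints the envelope instead of sending it.
    request = build_get_list_items_request(
        "http://sharepoint.example.com/sites/team", "Tasks"
    )
    print(request.full_url)
    print(request.data.decode("utf-8"))
```

Because the call goes through the web service endpoint rather than instantiating SharePoint objects from inside a WCF host, it sidesteps the code path where we hit the error. Sending the request for real would mean adding whatever authentication your farm requires and passing the request to urllib.request.urlopen.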

The Cause

There is a documented bug in SharePoint 2007 with one of the utility methods. The method itself seems innocuous enough: all it does is list out the stack, when asked, using the .NET StackTrace and StackFrame objects. Unfortunately, due to various factors (most of which I’m not privy to), these .NET objects may not be instantiated.

I think (and this is just presumption, which is why I italicized those two words) that StackTrace and StackFrame being null may be related to the fact that most SharePoint 2007 .NET code is just a series of wrappers for unmanaged COM code. Thus, I’m guessing (again, notice the italics) that the StackTrace and StackFrame objects are not instantiated in all circumstances. I mean, they’re not as fundamental as, for instance, the Object class. But I digress. Plus, I’m right at the edge of what I know and what I blatantly don’t, and if I tread any further, that will become painfully obvious.

In any case, due to some strangeness in how the SharePoint classes get instantiated when called by a WCF service, the StackTrace and StackFrame objects don’t get instantiated, but a couple of SharePoint utility methods that use them do get called. And since those utility methods don’t take into account the fact that StackTrace and/or StackFrame might not have been instantiated, they call methods on the null objects, causing the “Object reference…” error to be thrown.
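The pattern is easier to see in miniature. The sketch below is an analogy in Python, not the SharePoint code (which I can’t show and which is .NET): a tiny logging helper asks the runtime for stack information and then uses it. Python’s inspect.currentframe() is documented as returning None on implementations without stack frame support, which makes it a handy stand-in for a StackFrame that never got instantiated. The function names and the get_frame parameter are invented for the illustration.

```python
import inspect


def current_function_name_unsafe(get_frame=inspect.currentframe):
    """Mirrors the buggy pattern: assumes stack information always exists."""
    frame = get_frame()
    # If frame is None, this attribute access blows up, far away from
    # whatever actually caused the stack information to be missing.
    return frame.f_code.co_name


def current_function_name_safe(get_frame=inspect.currentframe):
    """Same helper with the guard the utility methods were missing."""
    frame = get_frame()
    if frame is None:
        return "<stack unavailable>"
    return frame.f_code.co_name


if __name__ == "__main__":
    # Simulate a host in which no frame object is available.
    no_frame = lambda: None
    print(current_function_name_safe(no_frame))
    try:
        current_function_name_unsafe(no_frame)
    except AttributeError as exc:
        print("unsafe version failed:", exc)
```

The unguarded version fails with Python’s equivalent of “Object reference not set to an instance of an object,” and it fails inside the logging helper rather than in the code that set up the odd hosting conditions. That is exactly why the error pointed us at a utility class that, at first glance, had nothing to do with the problem.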

Other Things We Tried

  • We started by making sure the WCF call was making it to the endpoint. Once we established that, we knew we would/should be able to rule that piece out. Besides, the error was happening in the SharePoint code, after a considerable amount of other code had already been called and run.
  • We noticed that the HttpContext and SPContext objects were null, and thought that might have something to do with it. We created context manually by creating an HttpRequest.
  • Since the error was happening inside custom feature-related code, we tried adjusting the settings to not activate any features. We still got the error, but now we could see it actually happening in the SharePoint utility class. This was almost more confusing, since our first thought was, “That doesn’t make any sense; it’s just a method in a utility class that displays StackTrace information.” I guess the lesson here is not to overlook what’s right in front of you.

One of the oddest things was that we initially didn’t find any mention of this bug on the web. Actually, my customer was the one who found a brief mention of turning off Verbose logging while searching archived forum postings. This led us to search our internal bug database, where we found the SharePoint bug documented.

And just in case you’re hoping this might be fixed in SharePoint 2007, I’ll let you know that this particular bug was marked “Postponed”. On the plus side, it is confirmed as fixed in SharePoint 2010.