Note: a workaround to provide relief to your users is to disconnect sessions instead of logging them off.
Update 111012: GOOD NEWS! The hotfix for this issue has been rolled up in the QFE version of MS11-077. Install this using /b:sp2qfe.
Update 110928: There is no news on a public version of the hotfix yet. I expect to know more in two to three weeks from now.
Update 110926: Going through tons of mail now… I expect this to take the remainder of today. But, it’s good to be back! 8) Another customer sent me this (thanks Torsten!): “We have uninstalled Adobe Acrobat Reader 10.1 and reinstalled Acrobat Reader 9.4.6 and everything is fine now.”
Update 110915: Another customer mentioned he suspects a relation with updating from Adobe Reader 10.0 to 10.1.0, since the timing of that update seems to correlate with when the crashes started (thanks Daniel!). Perhaps this applies to your situation as well! Please share any workarounds you find, since other customers can benefit from them too.
Update 110912-2: One of our customers (thanks Marcus!) told me that the issue went away in their farm after they removed Adobe Free Reader X and implemented Acrobat Reader 9. I hope this will help some of you out there.
Update 110912: Customer feedback is still coming in, thank you all! I am discussing this internally to see whether, and how, the fix for this issue can be released publicly. Keep the business cases coming!
Update 110909: More and more customers are contacting me about this problem. I now have almost 20 customers, varying in size, contract, and severity of impact. Please keep the data coming so I can make my business case a compelling one!
Update 110905: Various other customers contacted me regarding this STOP error, and it appears this is more widespread than I initially thought. I am trying to convince Engineering to make an exception, and release this as a standard hotfix. This requires a very solid business case though, and thus I am asking you, my audience, to please contact me with the impact this has on your environment, and create a case with Microsoft Support. Some initial details that would be very helpful to me are: # of servers hit, # of users impacted, and how frequently the servers crash. The more cases I have, the better!
Update 110818: The hotfix package is ready. I’ll be shipping it off to the customer shortly.
Update 110804: The latest instrumentation addresses the problem: no machines have crashed since installing this. I am now working to get the final package ready. Note: this will not be publicly available.
Update 110802: Things are starting to look good: our customer confirmed that the machines equipped with the sixth round of instrumentation have not hit the bugcheck so far. 🙂
After various customers have contacted me on this (thank you!), I’m starting to realize that not everybody has the luxury of a Premier Support agreement, and certainly not the luxury of an Extended Hotfix Support Agreement (EHSA). If you do not have this, you will not be able to obtain the instrumentation packages I talk about below, nor can you obtain the final hotfix – if and when we get one. 🙁 Even though I would very much like to see things differently, this is the way it is: this comes with running a product in the Extended Support phase. The best way out of this would of course be to upgrade to Win2008, or better, Win2008R2, as these are both in Mainstream Support. I do understand however that this is not something that can be done overnight.
Since the problem here is timing-related, the best thing to do is to update win32k.sys to the latest QFE, implement the latest Kernel QFE, update any graphics drivers, and update any Citrix components you may be running. This is the best I can do for anyone without EHSA…
Update 110725: Two customers who are experiencing this issue as well contacted me through the blog. It seems that a recent security update (KB2555917) may act as the catalyst for the issue to occur. Today I have sent the fifth instrumented win32k to our customer actively working on this, in order to obtain further information on the root cause.
Recently we’ve seen a couple of customers hitting this STOP 0x8E in win32k.sys. Research shows that we’ve seen this STOP error various times over the past few years. The root cause was never addressed: all other cases were resolved by updating some drivers or reinstalling machines. Evidently, this changed the timing enough that the problem no longer showed up in those cases. We now finally have a customer for whom we’ve filed a hotfix request. Let me know if you have this issue too, as there must be more customers out there hitting this!
The stack shows:
00 ac802ba8 bf84a41b win32k!xxxRedrawWindow+0x4c
01 ac802c00 bf83c69d win32k!xxxDestroyWindow+0x20f
02 ac802c0c bf8b7aeb win32k!HMDestroyUnlockedObject+0x1c
03 ac802c20 bf8b7e9c win32k!DestroyThreadsObjects+0x72
04 ac802c64 bf8b673f win32k!xxxDestroyThreadInfo+0x206
05 ac802c70 bf8b759e win32k!UserThreadCallout+0x4b
06 ac802c8c 8094c3d2 win32k!W32pThreadCallout+0x3a
07 ac802d18 8094c765 nt!PspExitThread+0x3b2
08 ac802d30 8094cab7 nt!PspTerminateThreadByPointer+0x4b
09 ac802d54 808897ec nt!NtTerminateThread+0x71
0a ac802d54 7c94847c nt!KiFastCallEntry+0xfc
We are currently running with an instrumented win32k.sys, to gather more information needed to determine and resolve the root cause.
Note: since this is Win2003 (in Extended Support), you will need an Extended Hotfix Support Agreement to obtain the hotfix.
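If you want a quick way to check whether the stack from your own dump looks like the one shown above, here is a rough sketch in Python. This is not a Microsoft tool: the function names, the `SIGNATURE` list, and the choice to compare only the top five win32k frames are my own assumptions for illustration. It expects frame lines in the same layout as the stack in this post (frame number, two addresses, then `module!function+offset`) and ignores the offsets, since those differ from one win32k.sys build to the next.

```python
import re

# Assumed line layout, taken from the stack in this post:
# "<frame#> <ChildEBP> <RetAddr> <module>!<function>+0x<offset>"
FRAME_RE = re.compile(
    r"^\s*[0-9a-f]{2}\s+[0-9a-f]{8}\s+[0-9a-f]{8}\s+(\w+)!(\w+)(?:\+0x[0-9a-f]+)?\s*$",
    re.IGNORECASE,
)

# Hypothetical signature: the top win32k frames of the stack in this post,
# innermost frame first. Which frames to include is an assumption.
SIGNATURE = [
    "win32k!xxxRedrawWindow",
    "win32k!xxxDestroyWindow",
    "win32k!HMDestroyUnlockedObject",
    "win32k!DestroyThreadsObjects",
    "win32k!xxxDestroyThreadInfo",
]


def parse_frames(text):
    """Return 'module!function' for every frame line, in stack order.

    Lines that do not look like a frame are skipped.
    """
    symbols = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            symbols.append(f"{match.group(1)}!{match.group(2)}")
    return symbols


def matches_signature(text, signature=SIGNATURE):
    """True if the top frames of the stack equal the signature (offsets ignored)."""
    frames = parse_frames(text)
    return frames[:len(signature)] == signature


# The stack from this post, used as sample input.
POST_STACK = """\
00 ac802ba8 bf84a41b win32k!xxxRedrawWindow+0x4c
01 ac802c00 bf83c69d win32k!xxxDestroyWindow+0x20f
02 ac802c0c bf8b7aeb win32k!HMDestroyUnlockedObject+0x1c
03 ac802c20 bf8b7e9c win32k!DestroyThreadsObjects+0x72
04 ac802c64 bf8b673f win32k!xxxDestroyThreadInfo+0x206
05 ac802c70 bf8b759e win32k!UserThreadCallout+0x4b
06 ac802c8c 8094c3d2 win32k!W32pThreadCallout+0x3a
07 ac802d18 8094c765 nt!PspExitThread+0x3b2
08 ac802d30 8094cab7 nt!PspTerminateThreadByPointer+0x4b
09 ac802d54 808897ec nt!NtTerminateThread+0x71
0a ac802d54 7c94847c nt!KiFastCallEntry+0xfc
"""

if __name__ == "__main__":
    print(matches_signature(POST_STACK))
```

Paste the stack text from your own debugger session in place of `POST_STACK` to compare. A match only tells you the call path is the same as in this post; it does not prove the root cause is the same, so please still open a case with Microsoft Support.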