[RESOLVED] Win2008R2 RTM/SP1: STOP 0x3B in rdbss!RxFsdCommonDispatch+ad7

Status: Resolved

Update 120531: Yes! The private fix is approved. We are releasing the hotfix in the July release cycle, under KB2719704.

Update 120510: We are now testing a private fix at several customers... fingers crossed that it resolves this long-runner! :D

Update 120417: Still ongoing. We are now awaiting verifier-enabled dumps of the crash. If you have this issue too, please run "verifier /volatile /flags 0x9 /adddriver rdpdr.sys rdbss.sys" and send me the resulting dump.

Update 120217: Yesterday I received an instrumented rdpdr.sys to deploy on affected machines. To obtain it and help us find the root cause, please create a support case and contact me.

Update 120120: I recently filed an RFC on this issue to get Engineering assistance in resolving it. I expect an update later next week.

Update 111115: Two more customers reported this. Since the problem goes away when we enable full Redirector (Rdr) tracing, I worked out a "light-weight" Rdr tracing configuration. We also enabled RdpDr tracing, since we suspect a race condition between Rdr and RdpDr. If you run into this problem, please enable both traces and contact me:

tracelog.exe -start RdrMup -guid #fc4b0d39-e8be-4a83-a32f-c0c7c4f61ee4 -b 256 -min 256 -max 256 -f %systemroot%\system32\LogFiles\mup.etl -flag 0x0f00000f -level 5
logman create trace RdpDr -ow -o %SYSTEMROOT%\system32\LogFiles\rdpdr.etl -p {73BFB78F-12B5-4738-A66C-A77BCD55FA12} 0x1 0xf -nb 256 256 -bs 256 -mode 0x2 -f bincirc -max 100 -ets

Note: tracelog.exe comes with the WDK; logman is included in the OS.

Update 111109: The first customer hitting this came back: they had the problem again... awaiting further information now.

Update 111108: This issue went silent for some time... Two customers who encountered this stopped having the problem after installing KBs 2579362 and 2521220. Today, a customer running SP1 contacted me, also encountering this issue.

Update 110808: Still no crash with the instrumentation and tracing running... customer notified me that their environment is still not running at full load because of the holiday season. Now we wait...

Recently, this case came to my attention, although it's been going on for quite some time. Perhaps there are more people out there who have servers experiencing this issue... :) If so, please let me know!

The call stack of the crashing thread shows:

2: kd> knL
 # Child-SP RetAddr Call Site
00 fffff880`03b30398 fffff800`018ccca9 nt!KeBugCheckEx
01 fffff880`03b303a0 fffff800`018cc5fc nt!KiBugCheckDispatch+0x69
02 fffff880`03b304e0 fffff800`018f340d nt!KiSystemServiceHandler+0x7c
03 fffff880`03b30520 fffff800`018faa90 nt!RtlpExecuteHandlerForException+0xd
04 fffff880`03b30550 fffff800`019079ef nt!RtlDispatchException+0x410
05 fffff880`03b30c30 fffff800`018ccd82 nt!KiDispatchException+0x16f
06 fffff880`03b312c0 fffff800`018cabb4 nt!KiExceptionDispatch+0xc2
07 fffff880`03b314a0 fffff880`02e172e0 nt!KiBreakpointTrap+0xf4
08 fffff880`03b31630 fffff880`02e35b74 rdbss!RxFsdCommonDispatch+0xad8
09 fffff880`03b31720 fffff880`04b8acd7 rdbss!RxFsdDispatch+0x224
0a fffff880`03b31790 fffff880`018b8271 rdpdr!DrPeekDispatch+0x31f
0b fffff880`03b317e0 fffff880`018b6138 mup!MupiCallUncProvider+0x161
0c fffff880`03b31850 fffff880`018b6b0d mup!MupStateMachine+0x128
0d fffff880`03b318a0 fffff880`013006af mup!MupFsdIrpPassThrough+0x12d
0e fffff880`03b318f0 fffff880`018f794d fltmgr!FltpDispatch+0x9f
0f fffff880`03b31950 fffff880`013006af mfehidk!DEVICEDISPATCH::DispatchPassThrough+0x105
10 fffff880`03b319b0 fffff800`01be8707 fltmgr!FltpDispatch+0x9f
11 fffff880`03b31a10 fffff800`01be8f66 nt!IopXxxControlFile+0x607
12 fffff880`03b31b40 fffff800`018cc993 nt!NtDeviceIoControlFile+0x56
13 fffff880`03b31bb0 00000000`7721f72a nt!KiSystemServiceCopyEnd+0x13

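The STOP 0x3B (SYSTEM_SERVICE_EXCEPTION) is raised because we hit a debug breakpoint (exception code 0x80000003, STATUS_BREAKPOINT) in rdbss!RxFsdCommonDispatch; frame 07, nt!KiBreakpointTrap, sits directly above it. The breakpoint fires because of an APC issue. Currently, we are working on instrumentation that will tell us where the APC problem is located. As soon as we know that, we can start thinking of a fix.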
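If you want to check whether your own dump matches this one, the stack can be read mechanically: the frame listed right below nt!KiBreakpointTrap is the code that executed the breakpoint. The following is only a rough Python sketch of that reading, not a tool we use internally. The helper names (parse_frames, frame_below) are made up for illustration, and the sample text is an excerpt of the knL output above. The +0xad8 it reports is presumably the address just past the one-byte breakpoint instruction, which would explain the +ad7 in the title.

```python
import re

# Excerpt of the `knL` output shown above (frames 05 through 0a).
STACK = """\
05 fffff880`03b30c30 fffff800`018ccd82 nt!KiDispatchException+0x16f
06 fffff880`03b312c0 fffff800`018cabb4 nt!KiExceptionDispatch+0xc2
07 fffff880`03b314a0 fffff880`02e172e0 nt!KiBreakpointTrap+0xf4
08 fffff880`03b31630 fffff880`02e35b74 rdbss!RxFsdCommonDispatch+0xad8
09 fffff880`03b31720 fffff880`04b8acd7 rdbss!RxFsdDispatch+0x224
0a fffff880`03b31790 fffff880`018b8271 rdpdr!DrPeekDispatch+0x31f
"""

# Columns: frame number, Child-SP, RetAddr, module!function[+0xoffset]
FRAME_RE = re.compile(
    r"^([0-9a-f]{2})\s+(\S+)\s+(\S+)\s+(\w+)!([\w:]+)(?:\+0x([0-9a-f]+))?$"
)


def parse_frames(text):
    """Turn `knL` text into a list of frame dicts; non-frame lines are skipped."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append({
                "num": int(m.group(1), 16),
                "module": m.group(4),
                "function": m.group(5),
                "offset": int(m.group(6) or "0", 16),
            })
    return frames


def frame_below(frames, trap="KiBreakpointTrap"):
    """Return the frame listed directly after nt!<trap>, or None if absent."""
    for i, frame in enumerate(frames[:-1]):
        if frame["module"] == "nt" and frame["function"] == trap:
            return frames[i + 1]
    return None


hit = frame_below(parse_frames(STACK))
print(f"{hit['module']}!{hit['function']}+0x{hit['offset']:x}")
# prints: rdbss!RxFsdCommonDispatch+0xad8
```

If this prints something other than rdbss!RxFsdCommonDispatch for your dump, you are probably looking at a different problem.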

To be continued...