[RESOLVED] WinVista/Win7: Users cannot log on (gpprefcl.dll)

Status: Resolved.

Update 110617: The KB is now public at https://support.microsoft.com/kb/2526870.

Update 110518: For both Windows Vista and Windows 7, the KB will be 2526870. This is scheduled to be released as part of HTP11-06, and will also be in Vista SP3, and Win7 SP2. Contact me if you encounter this issue before the KB is released.

Update 110512: A test hotfix package has been sent to the customer who originally reported this. Stay tuned for the results!

Below is some information on an interesting hang scenario that my colleague Sonny and I worked on. The symptom is that users cannot log on to their machines.

To identify this particular issue in a Complete Kernel Memory Dump, run "!stacks 2 criticalsection". The output shows various threads blocked in ntdll!LdrLockLoaderLock, that is, waiting for the loader lock. If you then look for the thread that owns this lock, you will find a thread like this:

1: kd> knL
  *** Stack trace for last set context - .thread/.cxr resets it
 # ChildEBP RetAddr 
00 b38b5c50 81cb4372 nt!KiSwapContext+0x26
01 b38b5c94 81c4ff38 nt!KiSwapThread+0x44f
02 b38b5ce8 81e3a894 nt!KeWaitForSingleObject+0x492
03 b38b5d50 81c52c7a nt!NtWaitForSingleObject+0xbe
04 b38b5d50 77795ca4 nt!KiFastCallEntry+0x12a
05 00d7ed38 77795470 ntdll!KiFastSystemCallRet
06 00d7ed3c 7776de5a ntdll!ZwWaitForSingleObject+0xc
07 00d7eda0 7776dd3d ntdll!RtlpWaitOnCriticalSection+0x155
08 00d7edc8 766c45a5 ntdll!RtlEnterCriticalSection+0x152
09 00d7edd8 766c47a3 SHELL32!DirectUI::CritSecLock::CritSecLock+0x14
0a 00d7eeec 766b7f5e SHELL32!kfapi::CFolderDefinitionCache::Load+0x26
0b 00d7efa4 766b8bd1 SHELL32!kfapi::CFolderPathResolver::GetPath+0x8b
0c 00d7f038 766c4f36 SHELL32!kfapi::CFolderCache::GetPath+0x1a5
0d 00d7f09c 766c4e98 SHELL32!kfapi::CKFFacade::GetFolderPath+0x5d
0e 00d7f0bc 766c591a SHELL32!SHGetKnownFolderPath_Internal+0x38
0f 00d7f0d8 766ba0ed SHELL32!SHGetFolderPathEx+0x30
10 00d7f118 6fe07024 SHELL32!SHGetFolderPathW+0xac
WARNING: Stack unwind information not available. Following frames may be wrong.
11 00d7f1c4 6fe07a22 gpprefcl+0x27024
12 00d7f1d4 6fe06bd8 gpprefcl+0x27a22
13 00d7f248 6fe1ec43 gpprefcl+0x26bd8
14 00d7f25c 6fe2b27b gpprefcl+0x3ec43
15 00d7f29c 777713b4 gpprefcl+0x4b27b
16 00d7f2bc 7775909b ntdll!zzz_AsmCodeRange_End
17 00d7f3b4 77759703 ntdll!LdrpRunInitializeRoutines+0x270
18 00d7f638 777594c7 ntdll!LdrpLoadDll+0x49a
19 00d7f8bc 77679355 ntdll!LdrLoadDll+0x22a
1a 00d7f920 77679373 kernel32!LoadLibraryExW+0x252
1b 00d7f934 742adf9c kernel32!LoadLibraryW+0x11
1c 00d7f960 742ade24 gpsvc!LoadGPExtension+0x3a
1d 00d7fb2c 742ad5ca gpsvc!ProcessGPOList+0x3bb
1e 00d7fd70 742a6037 gpsvc!ProcessGPOs+0x1d25
1f 00d7fe78 742a596c gpsvc!ApplyGroupPolicy+0x538
20 00d7fed0 742b5d14 gpsvc!CGroupPolicySession::ApplyGroupPolicyForPrincipal+0x322
21 00d7ff04 7769d0e9 gpsvc!CGroupPolicySession::ApplyGroupPolicyThread+0x32
22 00d7ff10 777716c3 kernel32!BaseThreadInitThunk+0xe
23 00d7ff50 77771696 ntdll!__RtlUserThreadStart+0x23
24 00d7ff68 00000000 ntdll!_RtlUserThreadStart+0x1b

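The thread above (T1) holds the loader lock: it is running the initialization routine of gpprefcl.dll (called from ntdll!LdrpRunInitializeRoutines), which calls SHGetFolderPathW and then blocks on a critical section inside SHELL32. The thread that T1 is waiting for is the one below (T2). T2 is itself blocked in ntdll!LdrLockLoaderLock, because SHELL32 is delay-loading ole32.dll, so it is in turn waiting for T1. This results in a deadlock.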

1: kd> knL
  *** Stack trace for last set context - .thread/.cxr resets it
 # ChildEBP RetAddr 
00 89f1ec50 81cb4372 nt!KiSwapContext+0x26
01 89f1ec94 81c4ff38 nt!KiSwapThread+0x44f
02 89f1ece8 81e3a894 nt!KeWaitForSingleObject+0x492
03 89f1ed50 81c52c7a nt!NtWaitForSingleObject+0xbe
04 89f1ed50 77795ca4 nt!KiFastCallEntry+0x12a
05 0128e438 77795470 ntdll!KiFastSystemCallRet
06 0128e43c 7776de5a ntdll!ZwWaitForSingleObject+0xc
07 0128e4a0 7776dd3d ntdll!RtlpWaitOnCriticalSection+0x155
08 0128e4c8 777746e1 ntdll!RtlEnterCriticalSection+0x152
09 0128e500 77759480 ntdll!LdrLockLoaderLock+0xe4
0a 0128e778 77679355 ntdll!LdrLoadDll+0xd0
0b 0128e7dc 776794d3 kernel32!LoadLibraryExW+0x252
0c 0128e7f0 7767950e kernel32!LoadLibraryExA+0x1f
0d 0128e810 766abfd1 kernel32!LoadLibraryA+0xb7
0e 0128e854 766aba42 SHELL32!__delayLoadHelper2+0x55
0f 0128e8e0 7669e62e SHELL32!_tailMerge_ole32_dll+0xd
10 0128ea04 7669ea6b SHELL32!kfapi::CFolderDefinitionStorage::LoadRegistry+0x66
11 0128eb5c 7669e93a SHELL32!kfapi::CFolderDefinitionStorage::Load+0x46
12 0128ec78 766b7f5e SHELL32!kfapi::CFolderDefinitionCache::Load+0x75
13 0128ed30 7668de3c SHELL32!kfapi::CFolderPathResolver::GetPath+0x8b
14 0128edc4 766c4f36 SHELL32!kfapi::CFolderCache::GetPath+0xb0
15 0128ee28 766c4e98 SHELL32!kfapi::CKFFacade::GetFolderPath+0x5d
16 0128ee48 7665cc17 SHELL32!SHGetKnownFolderPath_Internal+0x38
17 0128ee60 70698084 SHELL32!SHGetKnownFolderPath+0x2a
18 0128eea8 706968d9 fdeploy!CFileCacher::Init+0x4f
19 0128f388 706939bb fdeploy!CPolicyComputant::GetRedirectionInfo+0x26f
1a 0128f3d4 70694a5f fdeploy!CEngine::ProcessGroupPolicyEx+0x146
1b 0128f420 742ade7c fdeploy!ProcessGroupPolicyEx+0x46
1c 0128f60c 742ad5ca gpsvc!ProcessGPOList+0x4d0
1d 0128f850 742a6037 gpsvc!ProcessGPOs+0x1d25
1e 0128f958 742a596c gpsvc!ApplyGroupPolicy+0x538
1f 0128f9b0 742b5d14 gpsvc!CGroupPolicySession::ApplyGroupPolicyForPrincipal+0x322
20 0128f9e4 7769d0e9 gpsvc!CGroupPolicySession::ApplyGroupPolicyThread+0x32
21 0128f9f0 777716c3 kernel32!BaseThreadInitThunk+0xe
22 0128fa30 77771696 ntdll!__RtlUserThreadStart+0x23
23 0128fa48 00000000 ntdll!_RtlUserThreadStart+0x1b
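
The circular wait between T1 and T2 can be modelled as a small wait-for graph. The following Python sketch is only an illustration of the reasoning, not a debugger tool: the lock labels, the two dictionaries and the function name find_deadlock_cycle are hypothetical, and the ownership information is typed in by hand from the two stack traces rather than read from the dump.

```python
from typing import Dict, List, Optional

# Hypothetical labels for the two locks seen in the stack traces.
LOADER_LOCK = "loader lock"
KNOWN_FOLDER_CS = "SHELL32 known-folder critical section"

# Which thread owns each lock (inferred by hand from the stacks).
owner_of: Dict[str, str] = {
    LOADER_LOCK: "T1",      # T1 is inside LdrpRunInitializeRoutines for gpprefcl
    KNOWN_FOLDER_CS: "T2",  # T2 is inside kfapi::CFolderDefinitionCache::Load
}

# Which lock each thread is blocked on.
waiting_for: Dict[str, str] = {
    "T1": KNOWN_FOLDER_CS,  # RtlEnterCriticalSection reached via SHGetFolderPathW
    "T2": LOADER_LOCK,      # LdrLockLoaderLock reached via the delay-load of ole32
}


def find_deadlock_cycle(start: str) -> Optional[List[str]]:
    """Follow thread -> awaited lock -> owning thread until a thread repeats."""
    path: List[str] = []
    thread: Optional[str] = start
    while thread is not None and thread not in path:
        path.append(thread)
        lock = waiting_for.get(thread)
        thread = owner_of.get(lock) if lock is not None else None
    if thread is None:
        return None  # the chain ended at a thread that is not blocked
    # Return only the circular part of the chain, closed on itself.
    return path[path.index(thread):] + [thread]


if __name__ == "__main__":
    print(find_deadlock_cycle("T1"))
```

Starting from any other thread that is blocked on the loader lock leads to the same T1/T2 cycle, which is why the "!stacks" output shows many waiters even though only two threads form the actual deadlock.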

We have root-caused this issue and are waiting for a hotfix. Note that this happens on both Windows Vista and Windows 7. As soon as the hotfix is committed to, I will update this post with the KB number.