Windows Enterprise Client Boot and Logon Optimization – Part 17, Wait Analysis – More Challenging Example


This post continues the series that started here.

In my last post I introduced Wait Analysis, showing a simple example. In this post, I’ll describe the process of following waits from one thread to another.

Referring to the diagram from last time –

Threads

I’d start by finding thread T1 waiting, then identify that it’s waiting on thread T2, that thread T2 is waiting on thread T3, and so on.
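To make the idea concrete, here is a minimal Python sketch of following a wait chain. It assumes you already have wait records in hand as simple (waiting thread, readying thread, wait duration) tuples. The `Wait` class, the `follow_wait_chain` function, the extra thread T9 and the sample durations are all hypothetical and are not part of WPA.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Wait:
    """One wait record (hypothetical shape, not a WPA data structure)."""
    waiting_thread: str   # the thread that was blocked
    readying_thread: str  # the thread that readied (woke) it
    wait_us: int          # how long the wait lasted, in microseconds


def follow_wait_chain(waits: List[Wait], start: str) -> List[str]:
    """Follow the longest wait of each thread, starting from `start`."""
    # Keep only the longest wait seen for each waiting thread.
    longest: Dict[str, Wait] = {}
    for w in waits:
        best = longest.get(w.waiting_thread)
        if best is None or w.wait_us > best.wait_us:
            longest[w.waiting_thread] = w

    chain = [start]
    seen = {start}
    current = start
    while current in longest:
        nxt = longest[current].readying_thread
        if nxt in seen:  # stop if the chain loops back on itself
            break
        chain.append(nxt)
        seen.add(nxt)
        current = nxt
    return chain


# Made-up sample data mirroring the T1 -> T2 -> T3 diagram.
sample_waits = [
    Wait("T1", "T2", 30_600_000),
    Wait("T1", "T9", 1_200),       # a short, unrelated wait
    Wait("T2", "T3", 30_600_000),
]

print(" -> ".join(follow_wait_chain(sample_waits, "T1")))
```

The sketch always follows the single longest wait, which matches the manual approach in this post but will not suit every trace.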

Wait Analysis Example

In this example, users were experiencing slow logons to a Remote Desktop Server during the Winlogon phase.

As I did in the last post, I open the trace in Windows Performance Analyzer (WPA), select and zoom to the time of interest* and then examine the Generic Events table.

* Being a Remote Desktop Server, this scenario is actually a little more difficult to analyse because you can’t conduct a full boot trace and you don’t see boot phases using the FullBoot.Boot.Regions definition. Instead, we choose Performance Scenario General in Windows Performance Recorder and then use the FastStartup.Regions definition. We can observe individual Winlogon phases for each user.

image

Here I see a wait of around 30 seconds during RequestCredentials. This aligns with winlogon.exe (5332), ThreadID 5336.

As I did in my last post, I now examine this process and thread using the CPU Usage (Precise) table, order columns and expand the thread stack –

image

After ordering by the Waits (us) sum column, I see that most of thread 5336’s wait time falls in 8 separate events, and the longest single wait was around 30.6 seconds.

Furthermore, I see that winlogon.exe (5332), ThreadID 5336 waited on winlogon.exe (5332), ThreadID 5348. This new process/thread is where I want to look next. Repeating the same steps in the CPU Usage (Precise) table shows –

image

Note that because the thread stack is very deep, I’ve trimmed it in the graphic to save space; the green area represents the trimmed portion.

Now I see that winlogon.exe (5332), ThreadID 5348 waited 30.6s for LogonUI.exe (5384), ThreadID 5472 and again, this was a single wait … I’m heading in the right direction.

Again, I repeat the investigation, examining LogonUI.exe (5384), ThreadID 5472. This wait chain goes on for a few more threads. To save you from repetition, what I observe is –

LogonUI.exe (5384), thread 5472 waits on LogonUI.exe (5384), thread 5388
LogonUI.exe (5384), thread 5388 waits on LogonUI.exe (5384), thread 5464

Finally, examining LogonUI.exe (5384), thread 5464 yields a clue –

image

CtxWinlogonProv.dll is a Citrix binary, and some research turned up a Citrix support article –

CTX133873 - Slow Logons in XenApp Sites with Read-Only Domain Controllers

A Word of Warning

Wait Analysis won’t always yield obvious results. Sometimes, instead of a single event with a long wait, you’ll have thousands of events all with tiny waits. Understanding why that same wait is occurring over and over again may not be clear.
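As a rough illustration of the difference between these two patterns, the Python sketch below summarises a thread's wait durations and reports how much of the total the longest wait accounts for. The function names, the 0.8 threshold and the sample durations (other than the 30.6 second wait from the example above) are assumptions made for illustration only.

```python
from typing import Dict, List


def summarize_waits(wait_us: List[int]) -> Dict[str, float]:
    """Summarise a list of wait durations (microseconds) for one thread."""
    if not wait_us:
        return {"count": 0, "total_us": 0, "longest_us": 0, "longest_share": 0.0}
    total = sum(wait_us)
    longest = max(wait_us)
    return {
        "count": len(wait_us),
        "total_us": total,
        "longest_us": longest,
        # Fraction of all wait time spent in the single longest wait.
        "longest_share": longest / total if total else 0.0,
    }


def dominated_by_single_wait(wait_us: List[int], threshold: float = 0.8) -> bool:
    """True when one wait accounts for at least `threshold` of the total.

    The 0.8 default is an arbitrary assumption, not a recommended value.
    """
    return summarize_waits(wait_us)["longest_share"] >= threshold


# One long wait plus seven short ones (short durations are made up).
single_long = [30_600_000] + [5_000] * 7
# Thousands of tiny waits (made up).
many_tiny = [10_000] * 3000

print(dominated_by_single_wait(single_long))  # one wait dominates
print(dominated_by_single_wait(many_tiny))    # no single wait dominates
```

When no single wait dominates, following one readying thread is unlikely to explain the delay, and the call stacks become the better source of clues.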

Look for clues in the function names you see in the call stacks, and look for third-party modules that may explain your issues.

If you get this far and you’re still searching for answers, my suggestion would be that you log a support case with Microsoft. Use the keyword SBSL and it should be directed to the right team for investigation.

Conclusion

Parts 5 to 17 of this post series have covered the troubleshooting approach you might take for boot and logon performance issues that creep into production.

I’ve discussed boot phases, the activities that occur during and across those phases, the potential issues in each phase and the troubleshooting tools you can use to help identify those issues.

Next Up

Infrastructure and Settings – Group Policy

Comments (5)

  1. anonymouscommenter says:

    My peer Mark Renoden, Roger Southgate and Scott Duffey, whom I had the pleasure of meeting in Sydney

  2. Dora says:

    Hello, do you have an automated method for wait analysis? I found that the Microsoft Assessment Toolkit has WaitAnalysisConfig.xml, but I don't know how to use it.

  3. Dora says:

    Hi, in a real environment, wait analysis of boot is very difficult. For example, how to find the real wait chain confuses me a lot. Starting the analysis from the wait time is not necessarily correct.

    1. Mark Renoden says:

      Hi Dora

      I haven't found the automatic wait analysis particularly helpful in my own work. For this reason, I didn't include it in this series. Remember that you always have to start with some context. You're looking for a long boot phase that is not explained by high CPU or Disk I/O. Once you have a context, zoom to that period of time and then use Generic Events to identify a process and thread with a large gap between events. This is the approach that I've found most successful and I'll admit there are many times where I don't get a clear idea of the cause.

      1. Dora says:

        Hi Mark Renoden
        I understand what you mean, but I need to analyze hundreds of boot ETL files. Analyzing all of them manually would take a lot of time, so I think automatic analysis would be more efficient. I found this page:
        https://msdn.microsoft.com/zh-cn/windows/hardware/commercialize/test/wpt/optimizing-performance-and-responsiveness-exercise-1
        Do you know it? And do you have any suggestions for analyzing hundreds of boot ETL files?
