Optimization Guidance for Windows 7

The Microsoft SBSL (Slow Boot Slow Logon) Community continuously works on improving performance during system startup and logon. The causes of slow startup and logon range from hardware to software issues, and this blog post summarizes a fair number of the community's recommendations. Note that this list is far from complete and does not necessarily reflect Microsoft’s best practices.

 

The idea

  1. Boot and logon duration is primarily a function of workload. Everything you add to the base OS adds workload. Bigger workloads mean bigger delays. The more policies, scripts, and data you move, and the more services you start, the longer end users wait.
  2. Operations that take a long time after logon will affect boot and logon performance in an exaggerated way when resource consumption and concurrency conflicts exist. Restated, boot and logon performance is somewhat scenario specific. Inefficiencies that impact your post-logon performance will crush your boot and logon performance.
  3. Cost-justify everything you do. Do the benefits of workloads added during boot and logon justify the productivity cost they add? When calculating the total cost of an added workload, think about: 
    - The number of boots and logons per day
    - The number of business days per year
    - Average employee tenure in years
    - Employee cost per minute  
    - The minutes the additional workload contributes to boot and logon delay
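
As a rough illustration, the factors above can be multiplied into a per-employee cost. This is only a sketch: the function names and every input value below are hypothetical and not taken from any Microsoft guidance.

```python
def annual_delay_cost(logons_per_day, business_days_per_year,
                      cost_per_minute, delay_minutes):
    """Yearly cost, per employee, of the extra boot/logon delay."""
    return (logons_per_day * business_days_per_year
            * cost_per_minute * delay_minutes)


def tenure_delay_cost(logons_per_day, business_days_per_year,
                      cost_per_minute, delay_minutes, tenure_years):
    """Cost of the same delay over an employee's average tenure."""
    yearly = annual_delay_cost(logons_per_day, business_days_per_year,
                               cost_per_minute, delay_minutes)
    return yearly * tenure_years


if __name__ == "__main__":
    # Hypothetical inputs: 2 boots/logons per day, 220 business days,
    # 0.50 per minute of employee time, 1.5 minutes of added delay,
    # 5 years of average tenure.
    print(annual_delay_cost(2, 220, 0.50, 1.5))     # 330.0
    print(tenure_delay_cost(2, 220, 0.50, 1.5, 5))  # 1650.0
```

Multiply the result by the number of affected employees to compare it against the benefit of the workload you are adding.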

 

Hardware

During system startup Windows needs to read a lot of data from the local storage device. Depending on whether this is a clean Windows installation or an image containing provisioned applications, this can be anywhere between 150 MB and 500 MB (or possibly more), most of it being random I/O.

Rotational hard drives have high seek times because the arm has to move to the track of the disk where the data will be read or written. This makes them relatively poor at random I/O.
Solid State Drives (SSDs) are similar to the memory card you put in your camera: no physical movement is needed to read or write data. Have a look at the difference in random I/O speed on my local system, which contains two drives: one SSD for the operating system and one Hard Disk Drive (HDD) for bulk data storage.

image

Figure 1 – Winsat disk results on an SSD

image

Figure 2 – Winsat disk results on an HDD

In the figures above you can see that the SSD is about 150 times faster than the HDD at small random reads.
The main takeaway is that fast storage is crucial for quick startup and logon performance, and in most scenarios much more important than the CPU. When considering the purchase of new hardware, I’d rather have a fast SSD with a slower CPU than vice versa.

Having enough physical memory for the current operating system, applications and future service packs is also crucial. Due to the low price of physical memory, this bottleneck is becoming rare on today's Windows clients.

Some other settings that can impact performance at the hardware level are:

  • Drive acoustic modes, which limit the noise an HDD makes but reduce its performance.
  • Power saving modes, which can reduce the speed of disk I/O and potentially CPU I/O.

 

OS Servicing

On existing Windows 7 (and Windows Server 2008 R2) computers, consider and test the installation of all available fixes for known SBSL delays, both to avoid troubleshooting known issues and to potentially improve boot and logon performance. This boils down to the installation of KB2775511, as mentioned in my other blog post; this rollup contains 90 fixes. Make sure it goes through a proper test cycle before deploying it in production.

Also consider the installation of KB2792026 - Windows 7 SP1-based or Windows Server 2008 R2 SP1-based SMBv2 client computer freezes when the computer is under a heavy load.

Also integrate the same fixes into your OS build process for Windows 7 SP1 and Windows Server 2008 R2 SP1 images.

 

Base OS

  • A common issue is a large WMI repository, caused by applications that incorrectly treat the repository as a long-term store. Have a look at %SystemRoot%\System32\WBEM\Repository and the size of the objects.data file.
  • Registry hives (system, security, software) must be of reasonable size; some applications are known sources of registry bloat.
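
A quick way to keep an eye on these files is to record their sizes as part of your image baseline. The sketch below is one possible approach, not a supported tool: the helper name size_mb is made up for illustration, and it assumes the default hive location under %SystemRoot%\System32\config and an account with permission to query those files.

```python
import os


def size_mb(path):
    """Return the size of a file in MiB, or None if it cannot be queried."""
    try:
        return os.path.getsize(path) / (1024 * 1024)
    except OSError:
        return None


if __name__ == "__main__":
    # Assumption: default Windows locations; adjust for your environment.
    system_root = os.environ.get("SystemRoot", r"C:\Windows")
    candidates = [
        os.path.join(system_root, "System32", "wbem", "Repository", "OBJECTS.DATA"),
        os.path.join(system_root, "System32", "config", "SYSTEM"),
        os.path.join(system_root, "System32", "config", "SECURITY"),
        os.path.join(system_root, "System32", "config", "SOFTWARE"),
    ]
    for path in candidates:
        size = size_mb(path)
        if size is None:
            print(f"{path}: not accessible")
        else:
            print(f"{path}: {size:.1f} MiB")
```

What counts as a "reasonable" size depends on your image, so compare the numbers against a clean installation rather than against a fixed threshold.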

Application

  • Make sure you run the latest version, service packs and fixes for all security software.
  • Consider defining antivirus exclusions as described in KB822158.
  • Monitor the way applications use the registry and WMI.
  • Create a baseline trace using the Windows Performance Toolkit. After adding an application or changing settings in a new image, capture another trace and compare it with the baseline to note the performance impact.
  • For each application that you install, examine known boot, logon and post-logon performance issues, as well as memory, handle or other resource leaks, and discuss their mitigation with the application vendor.

Logon Scripts

  • Use fully qualified DNS names to remove any ambiguity about remote target computers.
  • When using the “Net use /D” command, it’s recommended to add the /Y parameter.
  • Scripts must be fault tolerant/resilient around password mismatches and inaccessible target computers.
  • Logon delays caused by script logic failures are frequently hidden by tools, or when scripts are configured to run minimized or in hidden mode.
  • Decrease the value of “Specify maximum wait time for Group Policy scripts” in Computer Configuration -> Administrative Templates -> System -> Scripts (which changes the registry setting HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\MaxGPOScriptWait) from the default of 10 minutes to a reasonable execution time for the set of scripts being executed over the links at your company.
  • Use Group Policy or Group Policy Preferences where applicable instead of KIX or other scripts.
  • Reduce the number of WMI queries inside a script.
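
To illustrate the point about inaccessible target computers, a script can test whether a server answers on the SMB port with a short timeout before it tries to map a drive, instead of waiting for a long failure. This is only a sketch in Python: the function name can_reach, the 2-second timeout and the server name in the usage note are assumptions for illustration.

```python
import socket


def can_reach(host, port=445, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 445 is the SMB port; name resolution failures, refused
    connections and timeouts all count as "not reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A logon script could call can_reach("fileserver.corp.example.com") (a hypothetical fully qualified name) and skip or log the drive mapping when it returns False, so that one unreachable server does not stall the whole logon.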

Locator

  • Sites and subnets must be defined for the subnets containing member workstations, servers and domain controllers.
  • Locator on clients and servers must pick the in-site or next closest DC for authentication, sourcing policy, profiles, logon scripts and LDAP/NSPI queries during boot, logon and post-logon operations. The use of remote site servers adds WAN latency, reduces packet size, and increases the potential for port blockage and packet fragmentation.
  • Windows Server 2003 Domain Controllers need the Read Only Domain Controller Compatibility Pack installed if RODCs are deployed, to prevent W2K3 DCs from auto-site covering for RODC-covered sites.

Network

  • To get optimal performance in redirection and protocol stacks, servers consulted during boot and logon should run the same or a newer OS version than the clients you are deploying. For example, if deploying Windows 7 on the clients, servers should be Windows Server 2008 R2 or later. If deploying Windows 8, servers should be Windows Server 2012 or later.
  • Hard-coded link speeds in NCPA on clients and servers should match the speed of the underlying network.
  • Hard-coded link speeds on routers must match the speed of the underlying network.
  • Network connectivity must be allowed over all well-known and ephemeral ports used by the Operating System versions and applications considered during boot and logon. Low and high range ephemeral ports must be open for Windows Server operating systems deployed on the network. Windows 2003 and earlier servers use the low port range. Windows 2008 and later use the “high” port range.
  • Any delay in fundamental operations like SMB dialect negotiation, SMB session setup and Tree Connect during boot, logon and post logon operations is very likely to reduce performance.
  • Test network throughput in a post-logon command prompt or Explorer instance for common operations performed during boot and logon. Note the size of the data and the time it takes to copy policy, profiles and logon scripts.
  • Spanning tree enabled switches should have PORTFAST enabled.
  • EnablePMTUDiscovery = 0 forces an MTU of 576 bytes for connections to hosts outside the local subnet. Override this with a policy and resolve the real problem that required you to set this key in the first place.
  • Assume that network infrastructure devices, especially WAN accelerators, block required ports, reduce packet size, decelerate network throughput and connect you to a sub-optimal target until proven otherwise at recurring intervals.
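
For the throughput test mentioned in this list, a small timing wrapper makes it easier to record size and duration consistently. The sketch below is one possible approach: the function name timed_copy and the paths in the usage note are hypothetical, and file-system caching can make repeated runs look faster than a first logon would be.

```python
import os
import shutil
import time


def timed_copy(src, dst):
    """Copy src to dst and return (size_bytes, seconds, MiB_per_second)."""
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    size = os.path.getsize(dst)
    # Guard against a zero interval on very small files.
    rate = size / elapsed / (1024 * 1024) if elapsed > 0 else float("inf")
    return size, elapsed, rate
```

For example, timed_copy(r"\\fileserver.corp.example.com\netlogon\logon.cmd", r"C:\Temp\logon.cmd") would time the copy of a logon script from a hypothetical share; run it against the same servers and links that clients use during boot and logon.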

Policy

  • Reduce the number of policies.
  • Reduce the number of WMI filters on policies.
  • Make a concerted effort to reduce the number of synchronous policies.
  • Already mentioned under Logon Scripts: decrease the value of “Specify maximum wait time for Group Policy scripts” in Computer Configuration -> Administrative Templates -> System -> Scripts (which changes the registry setting 
    HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\MaxGPOScriptWait) from the default of 10 minutes to a reasonable execution time for the set of scripts being executed over the links at your company.
  • Item level targeting fixes have to be in place on clients subject to item level targeting to improve logon performance and decrease CPU load caused by excessive LDAP traffic on DC role computers and network I/O.
  • Track the execution time, network I/O and DC load as you make changes to your computer and user policy. For example, inefficiently configured item level targeting can cause long delays on the client, massive increases in network I/O and high CPU load on the domain controllers servicing the resulting security group or LDAP requests.

Profile

  • Default and roaming user profiles must be of reasonable size.
  • Default and roaming profiles must be modified in a supported way.
  • Investigate workarounds for Active Setup if using mandatory roaming profiles.
  • The link to the server hosting the profiles should be a fast and low latency connection.

Services

  • Auto-Start only the services that need to be started during the boot and logon phase.
  • Use the Windows Performance Toolkit to track any service with a long startup time, generally speaking 0.5 seconds or more.

I hope this information helps. If you have any comments or questions, please consider leaving a reply.