Matthew Robben here, I’m a Program Manager on the Windows Server Performance team and my primary responsibility is Windows Server power management. Server power efficiency is a topic of considerable importance – in today’s difficult economy, IT organizations need to contain and reduce costs. Yet the cost of energy to power and cool a 1U server is now more than the amortized cost of the server (over 3 years).
Energy-efficient hardware and software reduce operational costs and directly impact an organization’s bottom line. We’re in the midst of developing Windows Server 2008 R2, and one of our goals for the product is to build a server operating system that is more power efficient than all of our previous releases. Furthermore, to help IT administrators better understand server power management and optimize their current Windows Server 2008 installations, today we’re releasing a comprehensive white paper called “Power In, Dollars Out: Reducing the Flows in the Data Center”. The white paper gives detailed explanations of many factors affecting server power efficiency, and contains a list of best practices for optimization.
One of the stated best practices is to properly configure Windows Server 2008’s power management features. According to the Green Grid, just turning on processor power management (PPM) features in the operating system can reduce power consumption by 20%. In Windows Server, this can be done simply by choosing the Balanced or Power Saver power policies (found in the Power Options applet in the Control Panel). Of course, PPM is a complicated technology, with many more toggles than a simple on/off switch. We’ve done quite a bit of work on the Windows Server PPM algorithms and parameters during R2 development. One result of this work is a set of parameters that can boost power efficiency by up to 10% on standard benchmark workloads.
Good news – you don’t need to wait until R2 to deploy these new parameters on your servers. This blog post will describe PPM technology, explain the parameters involved, and show benchmark test results for the parameter changes on a commodity server. It will also give you a handy command-line walkthrough of the powercfg.exe commands necessary to implement these changes in your environment.
First, some context. Power management requires cooperation from the hardware and the operating system to work efficiently. For example, hardware might support low power states, but the operating system schedules computational work and is in the best position to decide when low power states can be leveraged. The Advanced Configuration and Power Interface (ACPI) defines an interface between the operating system and server hardware to be used for power management purposes.
The processor has traditionally consumed the most power in a server, which makes it a great candidate for power-efficiency optimizations. To add detail and flexibility for processor power management, ACPI defines a few sets of states for processors. Performance states, or P-states, are one such set that can be leveraged to increase power efficiency.
Processors can transition between multiple performance states, or P-states. P-states define incremental levels of processor performance, from P0 (most performant) to Pn (least performant). The ACPI specification does not specify a maximum number of P-states, so Pn is used to refer to the highest numbered, lowest performant P-state that a processor supports.
Each successively higher numbered P-state consumes less power than the previous P-state. Processors can dynamically switch between these states during operation to provide only as much computational capacity as is necessary, which saves power during periods of low usage.
Figure 1 below shows a hypothetical set of six P-states that would be available to a processor. Note that the maximum P-state (P0) has the highest frequency, while successively higher numbered P-states reduce in frequency. In this case, the minimum P-state is P5, so the terms Pn and P5 would be interchangeable.
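To make the numbering convention concrete, here is a minimal Python sketch of a six-entry P-state table like the hypothetical one described above. The frequencies are invented for illustration and do not come from any real processor, and the helper name `lowest_power_state_for` is also hypothetical. It simply picks the highest numbered (lowest power) state that still covers a given demand.

```python
# Hypothetical P-state table: the list index is the P-state number and the
# value is the core frequency in MHz. These frequencies are made up.
P_STATE_MHZ = [2900, 2600, 2300, 2000, 1700, 1400]  # P0 .. P5 (Pn = P5)


def lowest_power_state_for(demand_mhz):
    """Return the highest numbered P-state whose frequency meets the demand.

    Falls back to P0 when the demand exceeds every available frequency.
    """
    # Walk from Pn (slowest) toward P0 (fastest) and stop at the first fit.
    for p_state in range(len(P_STATE_MHZ) - 1, -1, -1):
        if P_STATE_MHZ[p_state] >= demand_mhz:
            return p_state
    return 0


if __name__ == "__main__":
    for demand in (500, 1800, 2900, 3500):
        print(f"demand {demand} MHz -> P{lowest_power_state_for(demand)}")
```

This is only a model of the idea that a processor should provide no more capacity than is needed. The real decision is made by the kernel power manager using the parameters discussed in the rest of this post.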
Tuning P-State Parameters for Increased Power Efficiency
Windows Server contains a number of configurable P-state parameters. These can be used to finely tune the power/performance balance of Windows Server PPM. The defaults for these parameters are tuned to deliver excellent power efficiency for most systems and workloads out of the box. However, these are “safe” defaults. They balance performance and power efficiency. Default settings are shown in Table 1. Note that “P-state increase” in this context refers to a transition to a lower numbered, more performant P-state, whereas “P-state decrease” refers to a transition to a higher numbered, less performant P-state. Looking back to Figure 1, an increase would mean moving upward in the chart while a decrease would mean moving downward.
Table 1. Default P-State Parameter Settings
Time Check (default: 100): The time interval at which the operating system considers a change of the current P-state.
Increase Time (default: 100000): The minimum time period that must expire before considering a P-state increase.
Decrease Time (default: 300000): The minimum time period that must expire before considering a P-state decrease.
Increase Percent (default: 30): The utilization percentage [1] that the CPU must exceed to increase P-state.
Decrease Percent (default: 50): The utilization percentage that the CPU must be below to decrease P-state.
Domain Accounting Policy (default: 0): Determines how the kernel power manager accumulates idle time. Settings:
0 (On): idle time is accumulated only when all processors in an idle state domain [2] are idle.
1 (Off): idle time is accumulated and P-states are calculated for each processor without regard to any other processor in the domain.
Increase Policy (default: 0) and Decrease Policy (default: 1): Determine how P-state transition decisions are made. Settings:
IDEAL (0): calculates the target P-state based only on processor utilization and then finds a nearby available P-state on the system.
SINGLE (1): calculates an ideal P-state but only increases or decreases by one P-state per time check interval.
ROCKET (2): transitions to the highest P-state available on increase or lowest P-state available on decrease.
[1] The utilization percentage referenced here is not the same as the CPU usage counter in the Task Manager tool. Without going into more details, this setting is best optimized through empirical experimentation.
[2] A “state domain” is a dependency between different processor cores or packages on a server. Often, processor designs require that if one core is at a particular performance or idle state, the other cores or packages in the domain must also be at the same state. The hardware notifies the operating system of this dependency by establishing a domain through the ACPI interface.
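The difference between the IDEAL, SINGLE, and ROCKET policies is easiest to see side by side. The following Python sketch is a simplified model based only on the descriptions above, not the actual kernel algorithm. The function name `next_p_state` is hypothetical, and the ideal target P-state is taken as an input because this post does not describe how it is computed.

```python
IDEAL, SINGLE, ROCKET = 0, 1, 2


def next_p_state(current, ideal, policy, n_states):
    """Model the next P-state for one time check interval.

    Lower numbers are more performant, so an "increase" moves toward P0
    and a "decrease" moves toward P(n_states - 1).
    """
    if ideal == current:
        return current
    increasing = ideal < current
    if policy == IDEAL:
        # Jump straight to the calculated target.
        return ideal
    if policy == SINGLE:
        # Move only one P-state per interval, in the target's direction.
        return current - 1 if increasing else current + 1
    if policy == ROCKET:
        # Jump to the extreme state in the target's direction.
        return 0 if increasing else n_states - 1
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    # Currently at P3 with an ideal target of P1, on a six-state processor.
    for name, policy in (("IDEAL", IDEAL), ("SINGLE", SINGLE), ("ROCKET", ROCKET)):
        print(name, "->", f"P{next_p_state(3, 1, policy, 6)}")
```

Under this model, a SINGLE policy makes transitions in that direction more gradual, which is why the choice of policy for increases versus decreases biases the system toward performance or toward power savings.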
During Windows Server 2008 R2 development, our team determined a set of parameters that can boost energy efficiency with a very minor performance cost. Notice in Table 1 that the decrease time default is larger than the increase time default. This setting favors P-state increases over decreases. The default increase and decrease percentage settings of 30 and 50 percent, the default domain accounting policy, and the increase and decrease policy defaults favor P-state increases as well.
To tune the machine for more aggressive power savings, we suggest reducing the decrease time to 100 ms to match the increase time, changing the increase and decrease policies to favor P-state decrease, and switching the domain accounting policy to 1 (Off). We left the increase and decrease percentages at their defaults to ensure that the system PPM parameters were not completely biased toward power savings and to reduce negative performance consequences. Table 2 summarizes these changes.
Important: Modifying any of these parameters changes the behavior of performance state handling from the out-of-box experience. Before you deploy to production servers, validate the effects of any changes in a test environment.
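Table 2. Aggressive P-State Parameter Settings
Decrease Time: 100000 (default: 300000)
Domain Accounting Policy: 1 (default: 0)
Increase Policy: 1, SINGLE (default: 0, IDEAL)
Decrease Policy: 0, IDEAL (default: 1, SINGLE)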
These parameters can only be set using the powercfg.exe command-line tool, which is installed by default to the Windows\System32 folder on Windows Server 2008. The commands to change the P-state settings by using powercfg.exe are given at the end of this post.
To test the efficiency of these new power settings (henceforth called “Aggressive” settings), we performed a set of benchmark runs on a four-socket quad-core server. Table 3 gives the system configuration.
Table 3. Four-Socket Quad-Core Server Configuration
Processors: 4 quad-core 2.9-GHz
Memory: 32 4-GB DDR2 667-MHz DIMMs
Disks: 4 72-GB, 15,000-RPM SCSI
We ran the SPECPower benchmark with both the default settings and the Aggressive power saving settings. Figure 2 and Figure 3 show the power usage and power efficiency across different workload levels. The Aggressive settings deliver significantly better power efficiency than the default settings at a majority of the load levels. On this configuration, the largest gain occurs at the 60-percent workload level, where power efficiency improves by approximately 10 percent compared to the default settings. There is a negligible reduction in overall throughput at utilization levels above 97%.
These settings were tested on commodity servers with the SPECPower workload. Your particular hardware and workload might deliver different results. Please test any parameter changes before deploying in your production environment.
If you decide you want to deploy the new P-state parameter settings in your environment, you’ll first need to verify that your Windows Server 2008 installation is configured to use the Balanced power policy. Verify this by going to Power Options in the Control Panel.
Done? Next, you need to start a command prompt with administrator privileges. Get the binary dataset that represents the current Balanced AC power settings for P-states with the following command line (corrected from earlier versions of this post, thanks to Asmus for the heads up!):
>powercfg /getpossiblevalue sub_processor procperf 2
You should see the following:
This value represents an encoded dataset of power policy parameters. The parameter values for this dataset can be shown with the decode command:
>powercfg /ppmperf /decode 640864000000A0860100E09304001E00000032000000
Verify that your power parameter values match the defaults shown below and in Table 1. If your parameter settings do not match these values, your Windows Server parameters may have already been reconfigured for optimal power efficiency in your environment.
Busy Adjust Threshold: 100
Time Check: 100
Increase Time: 100000
Decrease Time: 300000
Increase Percent: 30
Decrease Percent: 50
Domain Accounting Policy: 0
Increase Policy: 0
Decrease Policy: 1
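If you are curious how the hexadecimal dataset maps to the values above, the following Python sketch decodes it. The byte layout here is an assumption inferred only from the two datasets quoted in this post (the default and the Aggressive one). It is not a documented format, and `decode_ppm_dataset` is a hypothetical helper rather than part of powercfg. Use powercfg’s own decode command as the authoritative source.

```python
import struct

# Assumed layout (inferred from the datasets in this post, not documented):
# byte 0 = busy adjust threshold, byte 1 = packed policy bits, followed by
# five little-endian 32-bit integers.
LAYOUT = "<BBIIIII"
FIELDS = ("Time Check", "Increase Time", "Decrease Time",
          "Increase Percent", "Decrease Percent")


def decode_ppm_dataset(hex_dataset):
    """Decode a 22-byte PPM dataset string into a dict of parameter values."""
    raw = bytes.fromhex(hex_dataset)
    busy, flags, *values = struct.unpack(LAYOUT, raw)
    decoded = {"Busy Adjust Threshold": busy}
    decoded.update(zip(FIELDS, values))
    # Assumed bit packing: bit 0 = domain accounting policy,
    # bits 1-2 = increase policy, bits 3-4 = decrease policy.
    decoded["Domain Accounting Policy"] = flags & 0x1
    decoded["Increase Policy"] = (flags >> 1) & 0x3
    decoded["Decrease Policy"] = (flags >> 3) & 0x3
    return decoded


if __name__ == "__main__":
    default = "640864000000A0860100E09304001E00000032000000"
    for name, value in decode_ppm_dataset(default).items():
        print(f"{name}: {value}")
```

Under this assumed layout, the default dataset decodes to the same nine values listed above, which is a useful sanity check when comparing datasets by eye.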
Next, you need to change the parameter values to match the “Aggressive” settings described in this post. To do so, use the following command:
>powercfg /ppmperf /encode base 640864000000A0860100E09304001E00000032000000 /decreasetime 100000 /domainaccountingpolicy 1 /increasepolicy 1 /decreasepolicy 0
After executing this command, powercfg will print out a binary dataset representing the new values, like the one shown below.
You need to apply the new dataset by using the setpossiblevalue command:
>powercfg /setpossiblevalue /sub_processor /procperf 2 binary 640364000000A0860100A08601001E00000032000000
Finally, use the setactive command to enable the new parameter set. No reboot is necessary for these parameters to take effect.
>powercfg /setactive scheme_balanced
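As a cross-check before applying a dataset, you can confirm offline that the Aggressive dataset differs from the default only in the intended fields. The Python sketch below re-encodes the default dataset with the four overrides used in this post. The byte layout is an assumption inferred from the two datasets shown here, not a documented format, and `encode_with_overrides` is a hypothetical helper that only mimics what the powercfg encode command does.

```python
import struct

# Assumed layout (inferred from the datasets in this post, not documented):
# byte 0 = busy adjust threshold, byte 1 = packed policy bits, followed by
# five little-endian 32-bit integers.
LAYOUT = "<BBIIIII"


def encode_with_overrides(base_hex, decrease_time=None, domain_accounting=None,
                          increase_policy=None, decrease_policy=None):
    """Return a new dataset string: the base dataset with overrides applied."""
    busy, flags, check, inc_time, dec_time, inc_pct, dec_pct = struct.unpack(
        LAYOUT, bytes.fromhex(base_hex))
    # Assumed bit packing: bit 0 = domain accounting policy,
    # bits 1-2 = increase policy, bits 3-4 = decrease policy.
    domain = flags & 0x1
    inc_policy = (flags >> 1) & 0x3
    dec_policy = (flags >> 3) & 0x3
    if decrease_time is not None:
        dec_time = decrease_time
    if domain_accounting is not None:
        domain = domain_accounting
    if increase_policy is not None:
        inc_policy = increase_policy
    if decrease_policy is not None:
        dec_policy = decrease_policy
    # Preserve any upper bits of the policy byte that this sketch does not model.
    flags = (flags & 0xE0) | domain | (inc_policy << 1) | (dec_policy << 3)
    packed = struct.pack(LAYOUT, busy, flags, check, inc_time, dec_time,
                         inc_pct, dec_pct)
    return packed.hex().upper()


if __name__ == "__main__":
    default = "640864000000A0860100E09304001E00000032000000"
    print(encode_with_overrides(default, decrease_time=100000,
                                domain_accounting=1, increase_policy=1,
                                decrease_policy=0))
```

With the overrides from this post, the sketch yields the same dataset string that is passed to setpossiblevalue above. If your own base dataset differs from the default, always rely on the dataset printed by powercfg itself.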
If you want to restore the default settings, use the setpossiblevalue command with the default dataset value (shown below), and follow it with a setactive command:
>powercfg /setpossiblevalue /sub_processor /procperf 2 binary 640864000000A0860100E09304001E00000032000000
>powercfg /setactive scheme_balanced
That’s it! You’ve taken your first step to increasing energy efficiency in your datacenter. As our white paper explains, there’s even more you can do. It’s a highly recommended read for cost-sensitive administrators.
Thanks for reading!
Windows Server Performance Team