One of my Server 2012 machines kept hanging, and it was more than annoying.
The system is a Dell Precision 690 Workstation with dual Xeon CPUs and the Hyper-V role installed. This machine uses an Intel storage controller and has an additional Dell SAS/SATA controller. Primarily, I use this machine as my iSCSI server to host disks for other servers in my lab.
The problem is that several times an hour, the entire system would hang for 20-30 seconds. It would always recover. However, all the other servers that depend on this server for an iSCSI connection to storage would also hang or throw errors because they could not reach the storage. All the VMs running on iSCSI disks would also hang until this self-corrected. VERY frustrating for demos.
I read several articles on the web, mostly pertaining to Windows 8. There are all sorts of recommendations, such as enabling hot swap options in the BIOS for AHCI-controlled disks, or changing from the Windows driver for the Intel storage controller to the Intel-branded RST drivers. Some people only experience this with SSDs installed, and this system does have three of them.
When the hang occurred, you would see the following in the System event log:
Log Name: System
Date: 6/19/2013 4:36:34 PM
Event ID: 129
Task Category: None
Reset to device, \Device\RaidPort0, was issued.
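If you want to gauge how often these resets happen, the exported log text can be scanned for Event ID 129. This is only a minimal Python sketch, assuming the events have been saved as plain text in the "Key: Value" layout shown above. The function name count_reset_events is my own, made up for illustration.

```python
import re

def count_reset_events(log_text, event_id=129):
    """Count 'Event ID: <event_id>' lines in exported event log text."""
    # Assumes one "Event ID: N" line per event, as in the excerpt above.
    pattern = re.compile(r"^[ \t]*Event ID:[ \t]*(\d+)\s*$", re.MULTILINE)
    return sum(1 for m in pattern.finditer(log_text) if int(m.group(1)) == event_id)

# Sample text copied from the event shown above.
sample = r"""Log Name: System
Date: 6/19/2013 4:36:34 PM
Event ID: 129
Task Category: None
Reset to device, \Device\RaidPort0, was issued.
"""

print(count_reset_events(sample))
```

Running it before and after a settings change gives a rough way to confirm whether the resets have actually stopped.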
What finally resolved this for me was changing the power management settings from Balanced to High Performance. The part of that change that was critical to this condition was the setting under PCI Express > Link State Power Management.
Turning this from “Moderate” to “Off” resolved the issue, and I no longer get these frequent hangs.
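The same setting can also be changed from the command line with powercfg instead of the Power Options dialog. Below is a rough Python sketch that builds the commands. It assumes the powercfg aliases SUB_PCIEXPRESS and ASPM are available on your system (check with "powercfg /aliases") and that the value indexes are 0 = Off, 1 = Moderate, 2 = Maximum. The helper names build_aspm_commands and apply_aspm are hypothetical.

```python
import subprocess

# Assumed mapping of Link State Power Management values to powercfg indexes.
ASPM_LEVELS = {"off": 0, "moderate": 1, "maximum": 2}

def build_aspm_commands(level="off"):
    """Return the powercfg commands that set ASPM on the current plan (AC power)."""
    index = ASPM_LEVELS[level.lower()]
    return [
        # Change the plugged-in value of the setting on the active plan.
        ["powercfg", "/setacvalueindex", "SCHEME_CURRENT",
         "SUB_PCIEXPRESS", "ASPM", str(index)],
        # Re-activate the plan so the new value takes effect.
        ["powercfg", "/setactive", "SCHEME_CURRENT"],
    ]

def apply_aspm(level="off"):
    """Run the commands; needs Windows and an elevated prompt."""
    for cmd in build_aspm_commands(level):
        subprocess.run(cmd, check=True)

# Print the commands rather than running them, so they can be reviewed first.
for cmd in build_aspm_commands("off"):
    print(" ".join(cmd))
```

This only touches the plugged-in (AC) value, which is what matters on a server. Call apply_aspm("off") once you are happy with the printed commands.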
From an article I found that discussed this:
PCI Express has "active-state" power management, which lowers power consumption when the bus is not active (that is, no data is being sent between components or peripherals). On a parallel interface such as PCI, no transitions occur on the interface until data needs to be sent.
In contrast, high-speed serial interfaces such as PCI Express require that the interface be active at all times so that the transmitter and receiver can maintain synchronization. This is accomplished by continuously sending idle characters when there is no data to send. The receiver decodes and discards the idle characters. This process consumes additional power, which impacts battery life on portable and handheld computers.
To address this issue, the PCI Express specification creates two low-power link states and the active-state power management (ASPM) protocol. When the PCI Express link goes idle, the link can transition to one of the two low-power states. These states save power when the link is idle, but require a recovery time to resynchronize the transmitter and receiver when data needs to be transmitted. The longer the recovery time (or latency), the lower the power usage. The most frequent implementation will be the low-power state with the shortest recovery time.
My guess is that the 20-30 second “hang” was related to this resynchronization process, and that turning the setting to “Off” keeps the PCI Express link in sync at all times. This might help you if you run into the same thing.