Author: Mohammed Anas Shaikh
Issue: Get-CsPoolUpgradeReadinessState showing NOT READY or BUSY all the time.
You may come across an issue where the output of the Get-CsPoolUpgradeReadinessState command always shows “NOT READY” or “BUSY” and never changes, even though the pool is stable, all the routing groups are evenly balanced, and there is no user impact. Every time you run Get-CsPoolUpgradeReadinessState, the output simply shows NOT READY or BUSY.
Because Get-CsPoolUpgradeReadinessState never shows a READY status, you cannot update or restart these FE servers.
Everything appears to be normal, but the output of the command does not reflect that, so it seems very likely that the command output is inaccurate.
Behind the scenes, Get-CsPoolUpgradeReadinessState checks certain performance counters to decide whether the pool is stable and ready for upgrade. If for some reason it is unable to read these counters, the output of the command may not be accurate.
To resolve this issue, you can use the information below.
What information does the Get-CsPoolUpgradeReadinessState command look at to decide whether a pool is ready for upgrade?
There are certain performance counters on every FE server that are enabled by default and should always stay enabled. The Get-CsPoolUpgradeReadinessState command reads these counters to decide whether a pool, or an FE server in an upgrade domain, is ready for update. If these performance counters are disabled, the command will never be able to reflect the true state of the FE server or pool.
Which counters must be enabled on the Lync FE server to make sure Get-CsPoolUpgradeReadinessState returns the correct state of the pool?
The performance counter named WRTCESPF is the counter that must be enabled for the Get-CsPoolUpgradeReadinessState command to work. If this counter is disabled on any FE server, Get-CsPoolUpgradeReadinessState will not be able to show the true state of the pool.
How can you find out whether the WRTCESPF counter is enabled and working?
Method 1 – Using the lodctr command (recommended method)
The output of the lodctr /Q command lists the performance counters on a particular FE server and shows whether each one is enabled.
To find out which counters are enabled on an FE server, do the following:
Open Command Prompt
C:\> LODCTR /Q >Counters.txt
Open the text file Counters.txt in Notepad
This file lists all the counters that are present on the server
Search for the counter named WRTCESPF
Make sure it says Enabled next to this counter in the text file (see example below). If it shows Enabled, the required counters are enabled.
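When there are several FE servers to check, searching Counters.txt by hand gets tedious. The following is a minimal Python sketch, not part of the original procedure, that scans lodctr /Q output for the WRTCESPF entry. It assumes each counter section starts with a header of the form "[Name] Performance Counters (Enabled)" or "(Disabled)"; the SAMPLE text, including its DLL Name lines, is illustrative only and much shorter than real output.

```python
import re

# Assumed shape of a section header in `lodctr /Q` output, e.g.
#   [WRTCESPF] Performance Counters (Enabled)
SECTION_RE = re.compile(
    r"^\[(?P<name>[^\]]+)\]\s+Performance Counters\s+\((?P<state>Enabled|Disabled)\)",
    re.IGNORECASE,
)


def counter_states(lodctr_text):
    """Map each counter service name (upper-cased) to True if it is enabled."""
    states = {}
    for line in lodctr_text.splitlines():
        match = SECTION_RE.match(line.strip())
        if match:
            enabled = match.group("state").lower() == "enabled"
            states[match.group("name").upper()] = enabled
    return states


def wrtcespf_enabled(lodctr_text):
    """Return True or False for WRTCESPF, or None if it is not listed at all."""
    return counter_states(lodctr_text).get("WRTCESPF")


# Illustrative sample only; real output contains many more sections and fields.
SAMPLE = """\
[PerfDisk] Performance Counters (Enabled)
    DLL Name: %SystemRoot%\\System32\\perfdisk.dll
[WRTCESPF] Performance Counters (Disabled)
    DLL Name: wrtcespf.dll
"""

if __name__ == "__main__":
    print(wrtcespf_enabled(SAMPLE))  # False: the sample marks WRTCESPF as Disabled
```

To use it against a real server, read Counters.txt into a string and pass it to wrtcespf_enabled. A result of None means WRTCESPF was not found in the file at all. If the file was produced by redirecting from PowerShell rather than Command Prompt, it may be UTF-16 encoded, so the encoding passed to open() may need adjusting.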
Method 2 – Using Performance Monitor
The counter WRTCESPF corresponds to the “LS:USrv - Cluster Manager” counter set in Performance Monitor.
“LS:USrv - Cluster Manager” is the counter set that Get-CsPoolUpgradeReadinessState actually reads to decide the state of the pool.
There are several counters listed under LS:USrv - Cluster Manager (as shown below).
To view them, do the following:
On the FE server
Go to Performance Monitor
In the Performance Monitor window
Click the green plus icon (Add)
Select LS:USrv - Cluster Manager
Then click the graph type icon
Select Report from the drop-down menu
This shows all the counters within Cluster Manager. The Get-CsPoolUpgradeReadinessState command reads these counter values to decide whether an FE server is ready for upgrade.
You can confirm this by starting PowerShell logging in OCS Logger and then running the Get-CsPoolUpgradeReadinessState command.
In the screenshot above for the LS:USrv - Cluster Manager counters, you will notice numeric values on the right. If the counter is working, these values are populated. In addition, if the counter is enabled, you will be able to find it in Performance Monitor. Together, these two things indicate that the counter is enabled.
You can also run commands to get the actual counter values. The command below returns the value of the “USrv - Number of routing groups for which the current machine is idle secondary replica” counter:
Get-Counter "\LS:USrv - Cluster Manager\USrv - Number of routing groups for which the current machine is idle secondary replica" -ComputerName $fe
On an FE server that is ready for update, the output of this command should be zero. If you run the command and get no output, that again indicates that the counter may be disabled.
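The reading rule for this counter can be written down as a small Python sketch, which may help when comparing values noted from several FE servers. It covers only this one counter, whereas the cmdlet reads others in the same counter set. The server names other than LYNCENT01 and all of the readings are hypothetical.

```python
def interpret_idle_secondary(value):
    """Interpret one reading of the idle-secondary-replica counter.

    value is the counter reading, or None if Get-Counter returned nothing.
    """
    if value is None:
        return "NO DATA (counter may be disabled)"
    if value == 0:
        return "OK for this counter"
    return "NOT READY (idle secondary routing groups: %d)" % value


def summarize(readings):
    """readings maps an FE server name to its counter value (or None)."""
    return {server: interpret_idle_secondary(v) for server, v in readings.items()}


# Hypothetical readings noted by hand from Get-Counter on each FE server.
readings = {"LYNCENT01": 0, "LYNCENT02": 3, "LYNCENT03": None}
for server, verdict in sorted(summarize(readings).items()):
    print(server, "->", verdict)
```

A server that reports no data is the one to investigate first, since a missing reading points at the counter itself rather than at the pool.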
Method 3 – Use a private PowerShell script
We have a script that was used in a customer environment to find out the state of the counters. It is an effective way of telling accurately whether the counters are enabled correctly, but I am not sure whether we can publish it externally, so I am not including it here. Please feel free to reach out to me if you would like to know more about it.
How do you reload the counters if they are disabled or not working?
If the Get-CsPoolUpgradeReadinessState command output never shows that the pool is READY, it could mean one of two things:
- The pool is actually not ready for update because routing groups are still being balanced in the background.
- The pool is steady, all routing groups have been balanced, and all users are working fine, but the state never changes from NOT READY to READY.
For point #1, the recommendation is to wait long enough for the routing groups to balance; Get-CsPoolUpgradeReadinessState should then eventually report a status of READY for the pool.
For point #2, the pool is steady, all routing groups have been balanced, and all users are working fine, but the state never changes from NOT READY to READY.
In this case there is a good possibility that everything is working fine, yet Get-CsPoolUpgradeReadinessState never shows READY even after you have waited a considerable amount of time, even days. The cause could be related to the performance counters (WRTCESPF, LS:USrv - Cluster Manager) that Get-CsPoolUpgradeReadinessState needs in order to report the true state of the FE server or pool.
Earlier in this document we discussed how to find out whether the performance counters (WRTCESPF, LS:USrv - Cluster Manager) are enabled and working.
If you have confirmed that the counters are in fact disabled, that is probably the cause of the issue.
However, you may come across a scenario where the WRTCESPF counter is actually enabled but Get-CsPoolUpgradeReadinessState still keeps showing a NOT READY status.
In both of these situations, follow the steps below:
Step 2: Reload the performance counters on every FE server
We have to reload the performance counters on every FE server that is having the issue.
To reload the counters, do the following on each FE server:
Open Command Prompt
Run the command regsvr32.exe /i /n wrtcespf.dll from C:\program files\Microsoft Lync Server 2013\Server\Core, as shown in the screenshot below:
C:\program files\Microsoft Lync Server 2013\Server\Core>regsvr32.exe /i /n wrtcespf.dll
Step 3: Manually set the correct permissions in the registry so that the RTC Server Local Group has full permissions to read these counter values
The RTC Server Local Group must have the proper permissions assigned in order to read these performance counter values. If the permissions have been altered, for example by a faulty or corrupt GPO, then Get-CsPoolUpgradeReadinessState may be unable to read the counters even though they are loaded and enabled correctly. Hence it is a good idea to set these permissions manually.
To set the correct permissions, do the following on every FE server after you reload the counters:
On the FE server
Open Registry Editor
Browse to HKLM\system\currentcontrolset\services\wrtcespf\performance\parameters
Right-click Parameters and select Permissions
Click Add, then click Locations and select the FE server that you are performing this operation on (in my example it is LYNCENT01)
Now enter RTC Server Local Group and click Check Names, as shown below:
On the Permissions for Parameters page, click Advanced
Select RTC Server Local Group
In the drop-down box next to Apply to:, select This key only
Select the checkbox to allow Full Control
Repeat the same on all the FE servers that have the issue.
Step 4: Recycle the RTCSrv service or restart the FE server
Once the counters have been reloaded and the appropriate permissions have been set, recycle the RTCSrv service on all the FE servers (one server at a time).
NOTE: Make sure you follow the guidelines when recycling the Windows service. Perform this on only one FE server at a time. Make sure it comes back up and the primary and secondary replicas are balanced before you move to the next server.
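The one-server-at-a-time rule can be sketched as a simple loop. The Python below is only an illustration of the ordering, not a tested operations script. The recycle and is_balanced callables are hypothetical placeholders that you would have to supply yourself (for example, wrappers around whatever commands you use to restart the service and to check replica balance), and the polling interval and retry limit are arbitrary assumptions.

```python
import time


def rolling_recycle(servers, recycle, is_balanced,
                    poll_seconds=60, max_polls=30, sleep=time.sleep):
    """Recycle servers strictly one at a time.

    recycle(server) and is_balanced(server) are caller-supplied, hypothetical
    hooks. The loop does not move to the next server until is_balanced reports
    True, and it stops entirely if a server never rebalances.
    """
    done = []
    for server in servers:
        recycle(server)
        for _ in range(max_polls):
            if is_balanced(server):
                break
            sleep(poll_seconds)
        else:
            # Never touch the remaining servers while one is still unbalanced.
            raise RuntimeError(
                "%s did not rebalance; completed so far: %s" % (server, done)
            )
        done.append(server)
    return done


if __name__ == "__main__":
    # Dry run with stand-in hooks that only print; nothing is restarted.
    def fake_recycle(server):
        print("recycling RTCSrv on", server)

    def fake_balanced(server):
        return True

    rolling_recycle(["LYNCENT01"], fake_recycle, fake_balanced)
```

Stopping on the first server that fails to rebalance is deliberate: continuing would mean recycling a second server while the pool is still unsettled, which is what the note above warns against.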
The NextHop team wants to thank our team member Mohammed Anas Shaikh in Microsoft CSS for this excellent post. We look forward to featuring more of his content soon!