I was working with one of my customers this week to upgrade their environment to SCCM 2012 R2 SP1. The environment is relatively simple: a single primary site with 3 child secondary sites. The primary site server upgrade went fine, and then we turned to the secondary site upgrades. I have executed several secondary site upgrades in my lab, customer labs, etc., and never experienced an issue whatsoever, so I guess I was about due for one to pop up. And there you have the reason for this blog… to share my experience with everyone so you don’t have to go through what I did to correct the issue.
First, I won’t be sharing how I fixed the issue. I just can’t, as fixing it requires altering the SCCM database, and you should NEVER do that without the guidance of Microsoft Support. If you experience an issue similar to the one outlined below, you should immediately call Microsoft Support for assistance. However, I will share my experience with you and how we ensured this did not happen on the remaining secondary sites we had to upgrade.
The method documented on TechNet for upgrading a secondary site is to open the Configuration Manager console, go to Administration > Sites, right-click the secondary site you wish to upgrade, and select Upgrade from the context menu. This action triggers a pre-requisite check on the secondary site to be upgraded. There are a couple of options for viewing the pre-requisite check progress:
In the console ribbon bar, there is a Show Install Status button that shows the pre-requisite check results, which you can refresh repeatedly to follow progress
You can view the ConfigMgrPrereq.log at C:\ on the primary site server
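If you would rather follow the log from a script than keep reopening it, here is a minimal Python sketch of polling a log file for newly appended lines. The function names `read_new_lines` and `watch` are my own for illustration, and the path in the usage note simply combines the C:\ location and file name mentioned above. Treat it as a sketch under those assumptions, not a supported tool.

```python
import time


def read_new_lines(path, offset=0):
    """Return (lines, new_offset) for complete lines appended since offset."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    # Only consume complete lines; a half-written line is picked up next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def watch(path, poll_seconds=2.0):
    """Print new log lines forever (stop with Ctrl+C). Hypothetical helper."""
    offset = 0
    while True:
        lines, offset = read_new_lines(path, offset)
        for line in lines:
            print(line)
        time.sleep(poll_seconds)
```

On the primary site server you could call `watch(r"C:\ConfigMgrPrereq.log")` while the pre-requisite check runs. The same function works for the other logs named in this post.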
Once the pre-requisite check completes, Hierarchy Manager (Hman) wakes up to begin the secondary site upgrade. One of the first actions in the database is that the secondary site information is updated to reflect the R2 SP1 version and build number, which are 5.00.8239.1000 and 8239 respectively. Once this occurs, the Upgrade option for the secondary site is greyed out in the console and you cannot attempt another upgrade. This is why you need to contact Microsoft Support if the secondary site upgrade fails: the upgrade option has to be opened back up. Hman will also check the site control file info in the database and, if all checks out, will create a job to build a compressed package of the secondary site upgrade/install media for the Sender component to send to the secondary site. All of this can be seen in the Hman.log and Sender.log on the primary site server.
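To make the version/build relationship concrete, here is a small Python sketch that pulls the build number out of a version string like the one above. It assumes the build is always the third dotted field, which matches the 5.00.8239.1000 / 8239 pair quoted here; the helper names are hypothetical.

```python
R2_SP1_VERSION = "5.00.8239.1000"  # version quoted in this post


def build_number(version):
    """Return the build (third dotted field) of a version string as an int."""
    parts = version.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dotted fields, got: " + version)
    return int(parts[2])


def is_at_least_r2_sp1(version):
    """True if the version's build is at or above the R2 SP1 build."""
    return build_number(version) >= build_number(R2_SP1_VERSION)
```

Keep in mind that, as described above, the database can already show the new version while the upgrade itself has not completed, so a matching build number alone does not prove the secondary site upgraded successfully.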
Once the compressed package is successfully sent to the secondary site, SMS_BOOTSTRAP installs itself on the secondary site server to begin the secondary site upgrade. Once SMS_BOOTSTRAP installs and begins the upgrade process, you’ll have 2 logs on the secondary site at C:\ that you can review: SMS_BOOTSTRAP.log and ConfigMgrSetup.log. Setup first tests the connection to the SQL database on the secondary site, and then a call is made to stop the site services, SMS_EXECUTIVE (SMS_Exec) and SMS_SITE_COMPONENT_MANAGER (SiteComp). Before SiteComp shuts down, it is responsible for shutting down the threads of SMS_Exec so that the SMS_Exec service can be stopped completely. After the SMS_Exec service is stopped, SiteComp is stopped, and then the upgrade occurs.
My customer and I began the upgrade on the first secondary site. We followed along as outlined above. Once we saw the send job complete to the secondary site, we logged on to the secondary site and opened up the logs to watch the progress. It didn’t take long before we saw the following error in the ConfigMgrSetup.log:
<10-22-2015 16:19:56> Failed to create process of SetupWpf.exe.
Looking at the lines above in the log for context, we can see why the process failed to create:
INFO: Configuration Manager shutting down services: Calling StopServices()
INFO: Notifying Site Component Manager of site shutdown…
SMS Site Component Manager cannot stop component SMS_SITE_SYSTEM_STATUS_SUMMARIZER on component server SERVER.DOMAIN.COM. The component is probably performing clean up tasks which could take a while, clicking Ignore will terminate the component and anything that it's doing.
Server components are experiencing fatal errors.
Why did it fail? SiteComp was unable to stop a particular thread of SMS_Exec. In this case, it could not stop SMS_SITE_SYSTEM_STATUS_SUMMARIZER.
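If you want to check a ConfigMgrSetup.log for this signature without reading it line by line, the following Python sketch scans for the two messages shown above. The patterns are taken only from the excerpt in this post, so treat them as assumptions about the log wording; the function name is hypothetical.

```python
import re

SETUP_FAILURE = "Failed to create process of SetupWpf.exe"
STUCK_COMPONENT = re.compile(
    r"cannot stop component (\S+) on component server (\S+)"
)


def find_upgrade_failure(lines):
    """Report whether setup failed and which components would not stop."""
    setup_failed = False
    stuck = []
    for line in lines:
        match = STUCK_COMPONENT.search(line)
        if match:
            # Drop the sentence-ending period that follows the server name.
            stuck.append((match.group(1), match.group(2).rstrip(".")))
        if SETUP_FAILURE in line:
            setup_failed = True
    return {"setup_failed": setup_failed, "stuck_components": stuck}
```

Passing the lines of the log to `find_upgrade_failure` returns a small dictionary. A non-empty `stuck_components` list alongside `setup_failed` being true matches the failure described here.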
Now, I have seen similar logging for primary site upgrades, but rather than seeing an “Abort”, you will see an “Ignore” and the upgrade of the primary site continues. For whatever reason, if a secondary site upgrade cannot stop the threads of SMS_Exec, it just chokes on itself and the upgrade fails. So where do you go from here? Unfortunately, it’s not as simple as going back into the console and clicking “Upgrade” again because, as mentioned earlier, the “Upgrade” option is now greyed out in the console. If you are at this point, you need to call Microsoft Support for help.
How to Avoid:
For my customer, I did some research on similar cases and I was shocked to see case after case after case of secondary site upgrade failures with the “Failed to create process of SetupWpf.exe” error, all due to SiteComp not being able to stop the services successfully. So, start your secondary site server upgrade from the console, then quickly log into the secondary site server and shut down the SMS_EXECUTIVE and SMS_SITE_COMPONENT_MANAGER services manually, or kill the processes associated with the services (smsexec.exe and sitecomp.exe respectively). Do this while the pre-req check is still running, before the bootstrap starts the upgrade on the secondary site server. On another note, as a precautionary measure, also disable any AV on the primary and secondary site servers before executing the upgrade process.
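The manual step above can be scripted. This is a minimal Python sketch, assuming it runs elevated on the secondary site server and that the standard Windows `net stop` and `taskkill` commands are available. The service and process names come from this post; the function names and the fallback logic are my own assumptions, not an official procedure.

```python
import subprocess

# (service name, process image) pairs, in the order listed in this post.
SERVICES = [
    ("SMS_EXECUTIVE", "smsexec.exe"),
    ("SMS_SITE_COMPONENT_MANAGER", "sitecomp.exe"),
]


def stop_command(service):
    """Command that asks Windows to stop a service and waits for it."""
    return ["net", "stop", service]


def kill_command(image):
    """Command that force-kills a process by image name."""
    return ["taskkill", "/F", "/IM", image]


def stop_site_services(run=subprocess.run):
    """Stop each service; fall back to killing its process if that fails."""
    results = []
    for service, image in SERVICES:
        completed = run(stop_command(service), capture_output=True, text=True)
        if completed.returncode != 0:
            # net stop also returns non-zero when the service is already
            # stopped, in which case taskkill simply finds nothing to kill.
            completed = run(kill_command(image), capture_output=True, text=True)
        results.append((service, completed.returncode))
    return results
```

Calling `stop_site_services()` returns the last exit code per service, so a non-zero value is worth a manual look in the Services console before the bootstrap kicks in. The `run` parameter exists only so the function can be exercised without touching real services.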
The Rest of the Story:
For my customer, we did alter the database to open the upgrade option back up in the console. We also ensured we manually stopped the SMS_EXECUTIVE and SMS_SITE_COMPONENT_MANAGER services. We then attempted the upgrade again. Well, it failed again. Following the trail of the upgrade process, we saw the pre-requisite check was successful. We then moved on to Hman and saw in the Hman.log that Hman could not upgrade the secondary site because it could not find the site GUID site definition property in the site control file info in the database. Oh boy, right? Long story short, more database modifications were needed (you can guess where), and after attempting the upgrade again, it was successful! The crazy part is that all of this was necessary just because SiteComp could not stop a thread of SMS_Exec on the secondary site server.
For the remaining secondary sites, we started the secondary site upgrade from the console, then quickly logged into the secondary site server and stopped the SMS_EXECUTIVE and SMS_SITE_COMPONENT_MANAGER services manually. The remaining secondary sites upgraded with no issue whatsoever.
If you want to avoid a lot of headaches, frustration, and a call to Microsoft Support, quickly log into your secondary site server after you start the upgrade from the console and stop the SMS_EXECUTIVE and SMS_SITE_COMPONENT_MANAGER services manually.