SharePoint Server 2016 Zero-Downtime Patching demystified


Disclaimer: this article covers only selected aspects of the upgrade process. For a complete list of upgrade approaches, including all details, please check the following articles:

 

Patching SharePoint servers has always been a challenge. A major pain point for administrators is that installing SharePoint fixes causes downtime for the SharePoint farm. For SharePoint Server 2016, zero-downtime patching was announced. This article outlines the reasons for downtime with previous versions of SharePoint and the changes implemented in SharePoint Server 2016 that make zero-downtime patching possible, including its constraints.

Historically this downtime can occur in two different steps:

  1. Installing the binaries
  2. Running the SharePoint configuration wizard.

Installing the binaries

While the actual binaries are installed, SharePoint-specific Windows services and IIS websites have to be stopped and restarted to ensure that the updated files (e.g. DLLs) are loaded.

The server-specific downtime usually lasts from the moment the hotfix executable is started until the hotfix installation is done.

With the huge hotfix files we are used to from SP2013 this can take several hours. Most of this time is spent extracting the exe and cab files and then applying the various installer msp files that are included in the hotfix package.

Russ Maxwell has created a script which helps to significantly reduce the amount of time required to install the actual fixes but you will still experience a downtime on the specific server during the installation of the fix.

To overcome this issue we recommend having high availability per role in your farm (at least two servers per role) to ensure that you can patch one server while the other one is online and continues to serve requests. That means during patching you can remove one server from load balancing, patch it and add it back to load balancing. Then repeat the same steps for the other server(s).

While the patches are applied using these steps, your farm will have servers with two different patch levels. That is fine, as SharePoint runs in backward-compatible mode if the database schema is older than the patch level of the servers in the farm.

So using redundant servers, you can achieve zero-downtime during the hotfix installation even with SharePoint 2013.

After the patch has been installed on all servers you need to run the SharePoint configuration wizard.
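Before starting the configuration wizard, it can help to check which servers and databases still report a pending upgrade. The following is a minimal sketch, not an official procedure. It assumes it is run in an elevated PowerShell session on a farm server by an account with farm administrator permissions.

```powershell
# Load the SharePoint cmdlets if this is a plain PowerShell window
# (not needed in the SharePoint Management Shell).
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Build version currently recorded for the farm.
(Get-SPFarm).BuildVersion

# Servers in the farm and whether each one reports a pending upgrade.
Get-SPServer | Select-Object Address, Role, NeedsUpgrade

# Content databases and whether their schema still has to be upgraded.
Get-SPContentDatabase | Select-Object Name, NeedsUpgrade
```

A value of True in the NeedsUpgrade column indicates that the upgrade phase has not yet been completed for that server or database.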

Running the SharePoint configuration wizard

The SharePoint configuration wizard needs to be run on all machines in the farm to complete the hotfix installation.

You can use either the command line version (psconfig.exe) or the UI version (psconfigui.exe). I have discussed benefits and caveats of using the different methods in a separate article.

It is recommended to first run the configuration wizard on one of the app servers in the farm to perform the database upgrade. After this step is completed the configuration wizard can be started on all the other servers in the farm to perform the server specific upgrade operations (copy dlls, install features, apply security settings, …) on each of these servers (again you should remove the servers where psconfig is running from load balancing as services are restarted).
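For the command line version, a commonly used invocation looks roughly like the sketch below. It assumes the default SharePoint Server 2016 installation path (hive folder "16"; SharePoint 2013 uses "15") and an elevated session on the server being upgraded. Adjust both to your environment.

```powershell
# Assumption: default install location of psconfig.exe for SharePoint Server 2016.
$psconfig = Join-Path $env:CommonProgramFiles 'microsoft shared\Web Server Extensions\16\BIN\psconfig.exe'

# Build-to-build (b2b) in-place upgrade. -wait keeps the command running
# until the upgrade has finished instead of returning immediately.
& $psconfig -cmd upgrade -inplace b2b -wait `
            -cmd applicationcontent -install `
            -cmd installfeatures `
            -cmd secureresources
```

Run it first on the app server that should perform the database upgrade, then on the remaining servers one at a time, with each server taken out of load balancing while psconfig is running.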

With SharePoint 2013 and earlier, the database upgrade step causes downtime on all servers in the farm because psconfig updates databases that are used by all servers in the SharePoint farm. Depending on the specific updates applied to the database, stored procedures, views, triggers and constraints may be dropped and recreated, and other database content may be updated. SQL queries issued by the SharePoint servers during this upgrade can fail, for example, if a stored procedure is called while the upgrade job is dropping it to replace it with an updated version. They can also fail if a fix contains two changes to two different stored procedures that depend on each other, but only one of the stored procedures has been updated when the request comes in.

These limitations could also cause failed upgrades or excessive slowdowns in the upgrade process because of resource contention and locking. For these reasons accessing SharePoint content while the database is upgraded is unsupported and untested.

A partial workaround for this exists in SharePoint 2013 for updating the content database: After installing the patch and before running the SharePoint configuration wizard customers can run

Upgrade-SPContentDatabase -UseSnapshot …

 

Upgrade-SPContentDatabase can perform the same upgrade for a content database as the SharePoint configuration wizard. The benefit of using this cmdlet is that you can run it in several PowerShell windows against different content databases at the same time. That can significantly reduce the database upgrade time after installing a hotfix, because the SharePoint configuration wizard would upgrade the content databases sequentially, one after the other.
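As a rough sketch of this approach, the snippet below first lists the content databases that still need an upgrade and then upgrades one of them. The database name is hypothetical. The idea is to open several PowerShell windows and run the last command in each of them with a different database name.

```powershell
# Load the SharePoint cmdlets if this is a plain PowerShell window.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# List the content databases that still report a pending upgrade.
Get-SPContentDatabase |
    Where-Object { $_.NeedsUpgrade } |
    Select-Object Name, Server

# Hypothetical database name; use a different one in each window.
$dbName = 'WSS_Content_Teams'

# Upgrade this one database without asking for confirmation.
Upgrade-SPContentDatabase -Identity $dbName -Confirm:$false
```

How many databases can sensibly be upgraded in parallel depends on the capacity of the SQL Server and of the SharePoint server running the cmdlet, so it is worth starting with a small number.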

Another benefit is the -UseSnapshot parameter listed above.

This parameter will create a snapshot of the specified database and then perform all upgrade operations that apply to the database. Existing connections to the content database from the different SharePoint servers in the farm will be set to use the snapshot for the duration of the upgrade, and then switched back after successful completion of upgrade. Be aware that this parameter can only be used with versions of SQL Server that support creation and use of snapshots, for example, SQL Server Enterprise edition.

As all regular SharePoint operations are executed against the snapshot while the content database is upgraded, the problems listed above will not occur. The caveat here is that the snapshot is read-only. That means that although it prevents the problems related to dropping stored procedures while they are in use, it also prevents all write operations to the content database for the duration of the upgrade.

After upgrading the SharePoint content databases you still have to run the SharePoint configuration wizard to upgrade all the other SharePoint databases – and this can also cause downtime to services accessing these databases during the upgrade.

So it is not possible to achieve zero-downtime for read-write operations during the hotfix installation with SharePoint 2013.

Changes in SharePoint Server 2016

With SharePoint Server 2016 the number of MSP files has been significantly reduced, which makes the time needed to patch the servers much shorter. Apart from that, you still have to ensure that you have more than one server per role (high availability) to guarantee zero-downtime patching, as the Windows services have to be restarted during patching. MinRole is not required for this! All the details on how to technically apply the patches can be found in the following TechNet article:

The biggest improvements over SP2013 are on the upgrade side. Several improvements have been made to ensure that upgrading the SharePoint databases does not lead to downtime for the end user. They cover how changes are applied, and they also restrict which changes are allowed in a hotfix request. For example, all stored procedures have to be backward compatible, so that a stored procedure updated by a hotfix can still be called from older stored procedures that are updated in a later step of the update cycle. We also update stored procedures without dropping them, to prevent outages if a stored procedure is called while it is in the limbo state between drop and recreation.

These changes have long been tested in SharePoint Online, where upgrades are performed for all SharePoint Online servers every couple of weeks while the service is live, without any read-only windows for customers.

These improvements apply to content databases as well as to the other SharePoint databases that need to be upgraded. You can still use Upgrade-SPContentDatabase to speed up the database upgrade step, but the -UseSnapshot parameter brings no benefit when trying to minimize downtime. It can actually be counterproductive, as it leads to a read-only content database during the upgrade.

With the improvements implemented and using redundant servers for each role it is possible to achieve zero-downtime during the hotfix installation with SharePoint Server 2016.

Comments (17)

  1. Hi Stefan,

    Super! Finally this feature has been clearly explained. Many thanks for that.

    Cheers,
    Marco

  2. Andy says:

    Thank you for the great insights!

    But do you still have to do all steps listed in the TechNet article “Install a software update for SharePoint Server 2016” for servers with search components?
    It was updated at the end of April but contains “Osearch15” instead of “Osearch16”…

    I would have guessed with all the improvements in SharePoint 2016 there would also be some improvements regarding search?

    Thank you
    Andy

    1. That is still accurate. The version number is most likely a copy&paste error.

  3. Andy says:

    Oh and a second question:

    The TechNet article “Software updates overview for SharePoint Server 2016” is from February and list 2 types of “Software update strategies”:
    1. Install the update and do not postpone the upgrade phase.
    2. Install the update and postpone the upgrade phase.
    (see: https://technet.microsoft.com/en-us/library/ff806329%28v=office.16%29.aspx#updatestrategy)

    I mean if you have redundant servers why don’t you just upgrade the first app server and then each front-end (while taking out of load balancing)?

    If you can access content from databases as an end user while upgrading databases why should i defer the upgrade phase in a redundant environment?

    Thank you again,
    Andy

    1. Hi Andy,
      there is no technical requirement to postpone, but some customers prefer to do it that way as they are used to doing it that way with 2013.
      It is still possible to postpone it. It is the decision of the customer which method he prefers.
      Cheers,
      Stefan

  4. Hi Stefan,

    in regards to Zero-Downtime Patching, you know there are some limitations for the Distributed Cache Service. The corresponding English TechNet page states it correctly in the “MinRole Server Deployments” section (https://technet.microsoft.com/en-us/library/mt346114(v=office.16).aspx). However, the German, Russian and Spanish versions (from my tests) still reference a quorum role for the Distributed Cache Service.

    From my experience and different TechNet articles (e.g. https://technet.microsoft.com/en-us/library/jj219572.aspx and https://technet.microsoft.com/en-us/library/jj219613.aspx?f=255&MSPPError=-2147217396) the statement about a nine-server MinRole deployment cannot be right. Eight should be right with the mentioned Distributed Cache limitations in the English article. Could you please shed some light on this (or report it internally to get it fixed)?

    Thank you

  5. Hi Stefan,

    Is zero-downtime patching supported in ‘Single server farm’ mode? If the DB upgrade does not cause any outage, is zero-downtime patching recommended for single server farm scenarios?

    Thanks,
    Karthik

    1. Hi Karthik,
      zero downtime patching requires high-availability of each server role.
      A single server farm cannot have high availability. Ergo no zero downtime patching on single server farm.
      Cheers,
      Stefan

  6. Redjesh Behari says:

    Hi Stefan, great article, thanks for that… I have one question though: what does Microsoft mean by highly available? Is it that every role should have a minimum of two servers, or three? Because highly available in a Skype for Business environment means at least three servers for the AppFabric. So I am a bit confused about the SharePoint part. Do we need two servers for each role or three? I hope you can shed some light on this matter.

    Thanks in advance,

    Rgds,

    Red

    1. Hi Redjesh,
      yes – at least two servers per role is required.
      Cheers,
      Stefan

  7. David says:

    Hi Stefan,
    I am still confused… Assume that I have TWO WFEs/APPs (load balanced) in SharePoint 2013. Does this structure support zero-downtime patching?
    Many thanks,
    David

    1. Hi David,
      SharePoint 2013 does NOT support zero downtime patching.
      Zero downtime patching can only be achieved with SharePoint Server 2016.
      Cheers,
      Stefan

  8. Dan Rosenberg says:

    Hi Stefan, your posts are much appreciated. I have a question regarding the “Language Dependent” second update file, e.g. from the TechNet page: “2. Run the wssloc2016-kb2920690-fullfile-x64-glb.exe file (that is, wssmui.msp).” On the page there is a “Note” saying: “You may need to extract the wssmui.msp file for each language installed on the farm”.
    My question is, do I need to extract and run only the languages relevant to my farm at this point?

    1. Hi Dan,
      you should not have to extract anything.
      Can you send me the link to the technet article where you read this?
      Thanks,
      Stefan

      1. Bob Dillon says:

        Hi Stefan, I was also a little confused about the language statement in this article:

        https://technet.microsoft.com/en-us/library/ff806338(v=office.16).aspx

        The article states:

        To install the update

        Run the sts2016-kb3115088-fullfile-x64-glb.exe file (that is, sts.msp).
        Run the wssloc2016-kb2920690-fullfile-x64-glb.exe file (that is, wssmui.msp).
        Note:
        You may need to extract the wssmui.msp file for each language installed on the farm.

        The word language is never mentioned again. Is the article trying to state that we need wssmui.msp for each additional language pack we have installed?

        Thanks!

        1. Hi Bob,
          this paragraph confuses me as well 😉
          Not sure what the writer tried to say here.
          You only have to install this exe once, that's all. It will automatically apply the fix to all languages installed on your system.
          Cheers,
          Stefan

          1. Bob Dillon says:

            Thanks for the clarification!
