How we do it: Manage and exceed security update compliance


Hello, my name is Naveen Kumar Akkugari and I am part of the service engineering group within the Enterprise Mobility & Management (EMM) group here at Microsoft. This blog is part of a series of posts that aims to answer the question "How does Microsoft do it?" Today I am going to discuss how we deploy security patches to secure our devices.

The purpose of this story is to share how we manage and exceed security update compliance on an ongoing basis using System Center Configuration Manager. Our goal is to go deeper into how we deploy security updates to secure our managed PCs at Microsoft. Aside from a general overview of our configuration, one of the main points we want to make here is how we've achieved that tricky balance between security and end-user experience by using our silent patching process for the first seven days. We are optimistic that most organizations can use our model and increase their patch success. Another key aspect to consider, which isn't otherwise covered here, is the importance of staying current with the latest version of Configuration Manager to consistently get the latest update management technologies (like Express support) and general, iterative quality improvements.

 Best Practices for security update management.

Based on our experience, I would like to share some best practices that can help in your environment.

  • To keep your environment secure and reliable, define a proactive process for software update deployment and set a target for software update compliance.
  • Establish a process to monitor the environment for non-compliant machines and take the necessary actions to install all required updates regularly.
  • Because software update deployment requires a lot of effort, automate as much as possible to reduce deployment time and human error.
  • Go through the patch release notes and take any additional actions they require beyond installing the security update.
  • When you have more than one software update point at a primary site, use the same WSUS database for each software update point. Because each change of WSUS server triggers a full scan, avoid switching WSUS server URLs and databases.
  • Create a new software update group for each month to keep the deployment from exceeding the limit of software updates per deployment.
  • Store updates from previous years in static software update groups. This makes updates easier to manage and reduces the load on servers and clients.
  • Clean up the WSUS database regularly, and periodically remove expired updates and content from deployments. This keeps the software update metadata smaller, which improves server performance and reduces scan time.
  • Periodically review your ADR settings and products/categories.
  • Use validation groups to help ensure that security updates do not negatively affect the business.
  • Make sure users understand the importance of securing the environment, and communicate with them every month about Patch Tuesday.
  • Create reports to confirm that patches deployed successfully and to verify that there is no negative impact.

 Infrastructure design

Before we get started with security update deployment and compliance, Figure 1 below gives a brief overview of our current System Center Configuration Manager (ConfigMgr) architecture. We currently manage about 360,000 desktops and around 250,000 users, with the unique aspects of our environment. We manage all global desktops through this architecture, which has 6 primary sites (we recently migrated one primary site to Azure) and 13 secondary sites spread around the globe, along with 93 distribution points. One dedicated primary site is used to assign all enrolled mobile devices, which are managed through Microsoft Intune using the ConfigMgr/Intune connector. At Microsoft, we use ConfigMgr for various service offerings, including software/cumulative updates, antimalware definition updates, compliance settings, application deployment, operating system deployment, and mobile device management. Our current approach for all deployments and changes is central administration, which means all global deployments and changes are executed from the Central Administration Site (CAS).

 

 

Figure 1: System Center Configuration Manager Architecture

As mentioned earlier, our goal is to provide a consistent user experience and ensure that all computers connected to the Microsoft network have the latest definition updates and the most recent security updates. Our priority, like that of many companies, is to keep the Microsoft environment secure. One aspect that makes Microsoft somewhat unique is the variety of products and services we dogfood all year round. As part of this dogfooding, we build thousands of machines daily and install unreleased applications along with unreleased operating systems. Because this is a unique and constantly changing environment, we set a compliance target of 95%, to be reached within nine business days for any critical security update and within three business days for any active exploit.
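As an illustration only, the compliance target and the two business-day windows described above can be expressed in a few lines of Python. This is a minimal sketch, not the tooling used at Microsoft: the function names are hypothetical, and "business day" is assumed to mean Monday through Friday with no holiday calendar.

```python
from datetime import date, timedelta

COMPLIANCE_TARGET = 0.95        # 95% target stated in the text
CRITICAL_SLA_DAYS = 9           # business days for critical security updates
ACTIVE_EXPLOIT_SLA_DAYS = 3     # business days for an active exploit


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start`.

    Assumption: business days are Monday-Friday; holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


def compliance_rate(compliant: int, total: int) -> float:
    """Fraction of machines that have installed the required updates."""
    if total <= 0:
        raise ValueError("total must be positive")
    return compliant / total


def meets_target(compliant: int, total: int,
                 target: float = COMPLIANCE_TARGET) -> bool:
    """True when the compliance rate reaches the target."""
    return compliance_rate(compliant, total) >= target
```

For a release on Tuesday, October 10, 2017, `add_business_days(date(2017, 10, 10), CRITICAL_SLA_DAYS)` gives the date by which the 95% target would need to be reached under these assumptions.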

 Software Update Delivery Design

To reduce manual effort and expedite the monthly software update deployment process, we use a pre-configured Automatic Deployment Rule (ADR) to automatically download and deploy the software updates that Microsoft releases for the current month.

To learn about creating an ADR, please review the content from the link below: https://blogs.technet.microsoft.com/configmgrdogs/2012/05/07/configmgr-2012-automatic-deployment-rules/

Our focus is on security updates only, but when needed we also include other updates. As a picture is worth a thousand words, here are some screenshots that show how we have configured updates in our production environment.

 

Beyond this, at Microsoft we have many different business groups doing various development and management activities. Because of this, we have slightly different service offerings in how we present ConfigMgr to them. Based on our environment and business requirements, we have multiple service offerings (SILO) for installing monthly security updates.

  1. Full Service is our primary service and is for all machines that are in supported scope. To this group, we deploy all mandatory applications and security updates automatically. In most cases, no end-user input is required.
  2. On Demand Patching service is for groups who need maximum control and flexibility in managing their systems. These groups do not want us to push anything to their systems that would install automatically, with or without their knowledge. We only publish monthly security updates to the systems, and the group owners kick off the installation when they are ready.
  3. Maintenance Window Service is for groups who don't require dogfood software deployed by us but would like to leverage our team's help in deploying security updates and endpoint definition updates. Their only requirement is that any installation happens within pre-configured maintenance windows.

Software Update Deployment Process

Environment pre-validation

To have a seamless experience with patch deployment, even though we monitor the infrastructure, we perform a few pre-validation tasks to make sure the infrastructure is healthy. Generally, we start these tasks 24 hours before Patch Tuesday so that we have some time to remediate any issues we identify.

  • As we have hundreds of servers, the first step is to make sure all servers are online and in a healthy condition. If we see any issues with servers, we take the necessary action.
  • Validate the ConfigMgr site status (every site's status should be "Active") by checking in the console to make sure there are no issues with sites.

Administration --> Overview --> Site Configuration --> Sites (then check the status for all servers)

  • Validate all roles and components to make sure all are healthy and performing normally. There should not be any abnormal resource usage on servers.
  • Cross-check for any ongoing major deployments in the hierarchy. As a best practice, we prefer to avoid any other major deployments at the same time.
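The first two checks in this list lend themselves to a simple scripted pass over exported status data. The sketch below is only an illustration under stated assumptions: the `SiteStatus` record and its fields are hypothetical and do not correspond to any ConfigMgr API, and the status values are assumed to have been exported from the console beforehand.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SiteStatus:
    """Hypothetical record for one site; not a ConfigMgr object."""
    site_code: str
    status: str          # status text as shown in the console, e.g. "Active"
    server_online: bool


def find_unhealthy(sites: List[SiteStatus]) -> List[Tuple[str, str]]:
    """Return (site_code, reason) pairs for sites that need attention."""
    problems = []
    for site in sites:
        if not site.server_online:
            problems.append((site.site_code, "server offline"))
        elif site.status != "Active":
            problems.append((site.site_code, f"site status is {site.status!r}"))
    return problems
```

An empty result means every listed site is online and reports "Active"; anything else is a candidate for remediation in the 24 hours before Patch Tuesday.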

Monthly Software Update deployment process:

When the Automatic Deployment Rule runs on the second Tuesday of every month, it automatically creates the update group and deployment, and updates the package with the newly published security updates. Once security updates are published, we use internal automation to create all three deployments. During this process, our automation validates last month's deployment settings to make sure the reboot setting is disabled. If last month's deployment is set to active (reboot enforced), the automation suppresses the reboot to avoid multiple reboots.
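To make the schedule and the reboot check concrete, here is a minimal Python sketch. It is not the internal automation mentioned above: the `Deployment` class and `ensure_reboot_suppressed` function are hypothetical stand-ins, and only the second-Tuesday rule and the "suppress last month's reboot" logic come from the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta


def second_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday (Patch Tuesday) of the given month."""
    first = date(year, month, 1)
    # weekday(): Monday is 0, Tuesday is 1
    days_to_first_tuesday = (1 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_tuesday + 7)


@dataclass
class Deployment:
    """Hypothetical stand-in for a deployment record; not a ConfigMgr object."""
    name: str
    reboot_suppressed: bool


def ensure_reboot_suppressed(previous: Deployment) -> bool:
    """Suppress the reboot on last month's deployment if it is still enforced.

    Returns True when a change was made, False when nothing had to change.
    """
    if previous.reboot_suppressed:
        return False
    previous.reboot_suppressed = True
    return True
```

Running the check before creating the new month's deployments means a machine that never rebooted for last month's updates is not forced to reboot twice.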

To release the deployment more quickly, we use the setting that downloads updates from Microsoft Update if the software update content is not available on a distribution point (Figure 2 below). By doing this, we release the software update deployment within ~3 hours of the security updates' release. When we set the deployment to show notifications, they are displayed periodically until all pending mandatory software updates are installed. By default, notifications display every 4 hours for deadlines more than 24 hours away, every hour for deadlines less than 24 hours away, and every 15 minutes for deadlines less than 1 hour away; the restart countdown is 120 minutes.
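The default notification cadence described above can be summarized as a small lookup. The following Python sketch only restates those defaults; the function name is hypothetical, and the treatment of a deadline that is exactly 24 hours or exactly 1 hour away is an assumption, since the text does not specify it.

```python
from datetime import timedelta

RESTART_COUNTDOWN = timedelta(minutes=120)  # default restart countdown from the text


def notification_interval(time_to_deadline: timedelta) -> timedelta:
    """Return how often a notification is shown for a given time to deadline.

    Assumption: exactly 1 hour falls in the hourly tier and exactly
    24 hours falls in the 4-hour tier.
    """
    if time_to_deadline < timedelta(hours=1):
        return timedelta(minutes=15)
    if time_to_deadline < timedelta(hours=24):
        return timedelta(hours=1)
    return timedelta(hours=4)
```

For example, a user whose deadline is two days away is reminded every 4 hours, and the reminders tighten to every 15 minutes during the final hour.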

 

Figure 2: Software Update deployment setting

 

Deployment Phases

Silent patching is our primary way to deploy updates to machines. In this phase, we deploy updates silently in the background without enforcing system reboots, to assure a good user experience on full service machines, and we set the deadline two hours after the deployment time to expedite update installation. After the update installation is complete, the machine goes into a "pending reboot" state, and we rely completely on natural reboots, which can be initiated by users or executed according to maintenance window settings. This gets around 70% of the machines updated and rebooted.

 

After deploying updates silently for six days, we convert the deployment to interactive so that users get notifications for pending patches. In this phase, we convert the existing deployment to display all notifications, with an enforced reboot that has a deadline of two days (our normal reboot deadline is the third Tuesday at 11:59 PM client local time). To expedite patch deployments, we also go beyond ConfigMgr and approve updates in Windows Server Update Services (WSUS) so that machines can install the latest updates from either ConfigMgr or WSUS.
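The two phases can be laid out as a timeline computed from the deployment time. This Python sketch is an illustration under stated assumptions: the function and key names are hypothetical, the deployment is assumed to be created on Patch Tuesday itself, and times are naive client-local datetimes. The reboot deadline uses the stated "third Tuesday at 11:59 PM" rule.

```python
from datetime import datetime, timedelta


def patch_schedule(deployment_time: datetime) -> dict:
    """Return the key milestones of the monthly patch cycle.

    Assumption: `deployment_time` falls on Patch Tuesday (the second
    Tuesday), so the third Tuesday is exactly seven days later.
    """
    third_tuesday = deployment_time + timedelta(days=7)
    return {
        # Silent phase: install deadline two hours after deployment, no forced reboot
        "silent_install_deadline": deployment_time + timedelta(hours=2),
        # After six days of silent patching, the deployment becomes interactive
        "interactive_start": deployment_time + timedelta(days=6),
        # Enforced reboot deadline: third Tuesday, 11:59 PM client local time
        "reboot_deadline": third_tuesday.replace(
            hour=23, minute=59, second=0, microsecond=0
        ),
    }
```

Calling `patch_schedule(datetime(2017, 10, 10, 13, 0))` yields the milestones for a deployment released at 1:00 PM on the October 2017 Patch Tuesday.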

Out of Band Release/update process

It is occasionally necessary to get a software update out more quickly, such as for a zero-day exploit or an active virus in the network. This happens rarely, but we have a process in place for situations like this. To get software updates out more quickly for any active exploit, we deploy updates set to display all notifications, with an installation deadline.

In this case, our goal is to deploy the updates as soon as possible to secure our network. To deploy the updates quickly, we use the interactive method so that users get notifications for pending patches. In this phase, we display all notifications with an enforced reboot that has a deadline of ~24 hours (our normal reboot deadline is the next day at 6:00 PM client local time). To expedite patch deployments, we also go beyond ConfigMgr and approve updates in Windows Server Update Services (WSUS) on the second day at 7:00 PM PST so that machines can install updates from either ConfigMgr or WSUS.

During this phase, desktops start downloading the updates as soon as they receive the policies and show notifications to users about software update availability. Users can take action based on their availability. Notifications are presented periodically until all pending mandatory software update installations have completed. By default, they display every 4 hours for deadlines more than 24 hours away, every hour for deadlines less than 24 hours away, and every 15 minutes for deadlines less than 1 hour away; the restart countdown is 120 minutes. This can place a huge burden on the ConfigMgr server infrastructure, so planning infrastructure capacity is key to meeting compliance in zero-day situations.

Remediation Process:

To ensure a high rate of compliance, we take some additional steps to see how effective our strategy has been and to quickly identify and address any failures or issues. To meet the compliance goal, we closely monitor the software update deployment and start remediation activities 24 hours after the software update deployment. During remediation, our first focus is to fix any issues on machines that are active but not performing WSUS scans. This gives us quick insight into how our software update efforts are going and alerts us if additional actions are necessary.
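The first remediation filter, machines that are active but not scanning, can be sketched as follows. This is an illustration only: the `Machine` record, its fields, and the seven-day definition of "active" are assumptions made for the example, not details from the process described above. Only the 24-hour wait before remediation comes from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Machine:
    """Hypothetical inventory record for one managed machine."""
    name: str
    last_heartbeat: datetime          # last time the client checked in
    last_scan: Optional[datetime]     # last successful update scan, if any


def scan_remediation_candidates(
    machines: List[Machine],
    deployed_at: datetime,
    now: datetime,
    active_within: timedelta = timedelta(days=7),  # assumed definition of "active"
) -> List[str]:
    """Return names of active machines with no scan since the deployment.

    Remediation only starts 24 hours after the deployment, so an empty
    list is returned before then.
    """
    if now - deployed_at < timedelta(hours=24):
        return []
    candidates = []
    for machine in machines:
        is_active = now - machine.last_heartbeat <= active_within
        scanned = machine.last_scan is not None and machine.last_scan >= deployed_at
        if is_active and not scanned:
            candidates.append(machine.name)
    return candidates
```

Machines that have not checked in recently are left out on purpose, since a scan problem cannot be fixed on a machine that is offline.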

The graph below shows the trend of our patch compliance reach.

 

 

We hope this blog has helped you understand how we manage and exceed security update compliance on an ongoing basis using System Center Configuration Manager. Our success in achieving that key balance of security and end-user experience is fundamentally built on our first week of silent updates. Stay tuned for the next blog on how we measure SLA and on the different Power BI dashboards for our software update compliance, including proactive alerting and monitoring of key dependencies.

 

Disclaimer: This may not be explicitly supported and this blog post is for informational purposes. This post is provided “AS IS” with no warranties and confers no rights.

 


Comments (9)

  1. Bryan Dam says:

    First, thanks for this post … very interesting stuff. It raises so many questions though.

    You only focus on and release security updates? That violates Microsoft’s own best practices. Every Premier ticket I’ve opened starts with the question “Have you applied all released patches for product X”. Is this only part of MS’s patching strategy and there is some other mechanism used to push non-security updates? For instance, how is MS deploying the preview updates released on the third Tuesday of each month?

    You only show one ADR. Most organizations split their deployments in some fashion (ex. each OS or workstation/server) to limit the amount of policy the clients have to deal with. MS apparently … doesn’t worry about that? You just deploy updates for every product listed there to 360,000 desktops? That sounds certain to make WMI cry.

    I’ve never seen such a large list of products selected. I’d love to know more about your WSUS setup that handles that. Most organizations’ WSUS infrastructure would be crushed by that kind of load. The recent ‘100% CPU and RAM Usage’ issue and blog post for example.

    Speaking of WSUS: “we also go beyond ConfigMgr and approve updates in [WSUS]”. Can you clarify on what instance of WSUS you’re talking about? Are you approving and downloading patches on an upstream WSUS instance or the instance that your SUPs run on? A lot of organizations would love to do the latter but doing so violates ConfigMgr’s support statements/policy.

    You don’t show any title filters. Are you deploying both the ‘Security Only’ (Monthly) and ‘Security and Quality'(Cumulative) rollups to your devices and letting the clients install them in whatever execution order they chose? If not, how are you picking between the two?

    Similarly, are you deploying Office 365 updates? If so, are you deploying all versions (ex. 1701 vs 1705) of the Semi-Annual channel (formerly Deferred Channel) and letting the clients install them in whatever execution order they chose? If not, how are you picking between them?

    When exactly on Patch Tuesday are you running your sync and ADR? More and more, Office 365 and Windows 10 patches are being released one, sometimes even two, days after Patch Tuesday. In Redmond’s timezone. Running a monthly ADR on Patch Tuesday grabbing just the last 8 hours is going to miss those.

    Again, thanks for this post and sorry for being so inquisitive about it.
    Bryan

  2. Marty Lichtel says:

    Thanks for this write up. I am confused about your use of WSUS to approve updates. Are you doing this on the SUP role? I did not think it was supported to manage WSUS like a standalone installation if it was integrated into ConfigMgr.

  3. Pavel Yurenev says:

    This ADR seems to create a new group for ALL updates released in last 8h: https://msdnshared.blob.core.windows.net/media/2017/10/53.png – and download all them to a single Update Package.

    It should create a HUGE package, which should be distributed to all sites/DPs in this case, and the larger is the package the harder it to manage. Also for some reason many unsupported products are listed: this won’t give you unneeded updates, but will provoke more load on SQL when running ADR.

    BTW, how SUP and ADR sync schedules are defined?

  4. Vladislav Alyushin says:

    Hello. I have one small question – you synchronize WSUS, SCCM in work hours?

  5. ShawnD1 says:

    Thank you for the great post. Had one question: What are you using to report compliance? Doesn’t look to be a canned report.

  6. Pavan says:

    “Once security updates are published, we use internal automation to create all three different deployments. During this process our automation will validate last month deployment settings to make sure reboot setting is disabled. If the last month deployment is set to active (reboot enforced) then it will suppress the reboot to avoid multiple reboots”

    Can you explain what kind of automation you are using in this phase ? can we do the same out side MS ?

    Also can we do a pre-reboot for patching?

  7. Alex says:

    Can you please elaborate on this statement?: “To expedite patch deployments, we also go beyond ConfigMgr and approve updates in Windows Server Updates Services (WSUS) on second day 7:00 PM PST so that machines can install updates from either ConfigMgr or WSUS.”. I was under the impression that when using ConfigMgr we were not supposed to touch the WSUS server at all.

  8. Ilia says:

    Hello. Good job. Very interesting. Can you show the source of report, which you use for show progress your installation (Compliance Trend)?

  9. Nigel Wadsworth says:

    Great article, thanks. How do you generate the “Compliance Reach Trend” chart?
