Hello, my name is Naveen Kumar Akkugari and I am part of the service engineering group within the Enterprise Mobility & Management (EMM) group here at Microsoft. This blog is part of a series of posts that aims to answer the question "How does Microsoft do it?", and today I am going to discuss how we deploy security patches to secure our devices.
The purpose of this story is to share how we manage and exceed security update compliance on an ongoing basis using System Center Configuration Manager. Our goal is to go deeper into how we deploy security updates to secure our managed PCs at Microsoft. Aside from a general overview of our configuration, one of the main points we want to make here is how we've achieved that tricky balance between security and end-user experience by using our silent patching process for the first seven days. We are optimistic that most organizations can use our model and increase their patch success. Another key aspect to consider, which isn't otherwise covered here, is the importance of staying current with the latest version of Configuration Manager to consistently get the latest update management technologies (like Express support) and general, iterative quality improvements.
Best Practices for Security Update Management
Based on our experience, I would like to share some best practices that can help in your environment.
- To keep your environment secure and reliable, make sure you have a well-defined, proactive process for software update deployment, and set a target for software update compliance.
- Establish a process to monitor the environment for non-compliant machines, and take the necessary actions to install all required updates regularly.
- Because software deployment requires a lot of effort, automate as much as possible to reduce both the time needed to deploy updates and the chance of human error.
- Go through the patch release notes and take any additional actions they require beyond installing the security update.
- When you have more than one software update point at a primary site, use the same WSUS database for each software update point. Because each change of WSUS server triggers a full scan, try to avoid switching WSUS server URLs and databases.
- Create a new software update group each month to keep a deployment from exceeding the limit on software updates per deployment.
- Store updates from previous years in static software update groups. This makes updates easier to manage and reduces the load on servers and clients.
- Clean up the WSUS database regularly, and periodically remove expired updates and content from deployments. This keeps the software update metadata smaller, which improves server performance and reduces scan time.
- Periodically review your ADR settings and products/categories.
- Use validation groups to help ensure that security updates do not negatively affect the business.
- Make sure users are aware of the importance of securing the environment, and communicate with them every month about Patch Tuesday.
- Create proper reports to confirm the successful deployment of patches and verify that there is no negative impact.
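The monthly software update group practice above can be sketched in a few lines of Python. This is only an illustration, not our production automation: the `(update_id, release_date)` tuples, the function names, and the `MAX_UPDATES_PER_DEPLOYMENT` value of 1000 are assumptions, so confirm the actual limit for your Configuration Manager version.

```python
from collections import defaultdict
from datetime import date

# Assumed per-deployment limit; verify against your ConfigMgr version.
MAX_UPDATES_PER_DEPLOYMENT = 1000


def group_by_month(updates):
    """Group (update_id, release_date) pairs into one group per month.

    Returns a dict keyed by "YYYY-MM", mirroring one software update
    group per monthly release.
    """
    groups = defaultdict(list)
    for update_id, released in updates:
        groups[released.strftime("%Y-%m")].append(update_id)
    return dict(groups)


def oversized_groups(groups, limit=MAX_UPDATES_PER_DEPLOYMENT):
    """Return the names of groups that exceed the per-deployment limit."""
    return sorted(name for name, ids in groups.items() if len(ids) > limit)
```

Keeping one group per month keeps each group small, and a check like `oversized_groups` can flag a group before it is deployed.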
Before we get started with security update deployment and compliance, Figure 1 gives a brief overview of our current System Center Configuration Manager (ConfigMgr) architecture. We currently manage over 360,000 desktops and around 250,000 users, along with the unique aspects of our environment. We manage all global desktops through the architecture below, which has 6 primary sites (we recently migrated one primary site to Azure) and 13 secondary sites spread around the globe, along with 93 distribution points. One dedicated primary site is used to assign all enrolled mobile devices, and those devices are managed through Microsoft Intune using the ConfigMgr/Intune connector. At Microsoft, we use ConfigMgr for various service offerings, including software/cumulative updates, antimalware definition updates, compliance settings, application deployment, operating system deployment, and mobile device management. Our current approach for all deployments and changes is central administration, which means all global deployments and changes are executed from the Central Administration Site (CAS).
Figure 1: System Center Configuration Manager Architecture
As mentioned earlier, our goal is to provide a consistent user experience and ensure that all computers connected to the Microsoft network have the latest definition updates and the most recent security updates. Our priority, like that of many companies, is to keep the Microsoft environment secure. One aspect that makes Microsoft somewhat unique is the variety of products and services we are dogfooding all year round. In this process of dogfooding, we build thousands of machines daily and install unreleased applications along with unreleased operating systems. Because this is a unique and constantly changing environment, we set a 95% compliance target, to be reached within nine business days for any critical security update and within three business days for any active exploit.
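To make the targets concrete, here is a minimal Python sketch that computes a business-day deadline and checks a compliance percentage against the 95% goal. The function names are hypothetical, the calculation ignores public holidays, and it assumes the release day itself is not counted, so treat it as an illustration rather than how our reporting is implemented.

```python
from datetime import date, timedelta


def add_business_days(start, days):
    """Return the date that is `days` business days (Mon-Fri) after `start`.

    Assumption: holidays are ignored and the start day is not counted.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def meets_target(compliant, total, target_percent=95.0):
    """Return True when the compliance percentage reaches the target."""
    if total == 0:
        return False
    return 100.0 * compliant / total >= target_percent
```

For example, for a release on Tuesday, March 14, 2017, `add_business_days(date(2017, 3, 14), 9)` gives March 27, 2017 for a critical update, and `add_business_days(date(2017, 3, 14), 3)` gives March 17, 2017 for an active exploit.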
Software Update Delivery Design
To reduce manual effort and expedite the monthly software update deployment process, we use a pre-configured Automatic Deployment Rule (ADR) to automatically download and deploy the software updates that Microsoft releases for the current month.
To learn about creating an ADR, please review the content from the link below: https://blogs.technet.microsoft.com/configmgrdogs/2012/05/07/configmgr-2012-automatic-deployment-rules/
Our focus is on security updates only, but when needed we also include other updates. As a picture is worth a thousand words, here are some screenshots to share how we have configured updates in our production environment.
Beyond this, at Microsoft we have many different business groups doing various development and management activities. Because of this, we have slightly different service offerings in how we present ConfigMgr to interface with them. Based on our environment and business requirements, we have multiple service offerings (SILOs) to install monthly security updates.
- Full Service is our primary service and is for all machines that are in the supported scope. To this group, we deploy all mandatory applications and security updates automatically. In most cases, no end-user input is required.
- On Demand Patching service is for groups who need maximum control and flexibility in managing their systems. This group would not want us to push anything to their system that would install automatically with or without their knowledge. We will only publish monthly Security Updates to the systems, and the group owners kick off the installation of security updates when they are ready.
- Maintenance Window Service is for groups who don't require dogfood software deployed by us, but would like to leverage our team's help in deployment of Security Updates & Endpoint Definition Updates. The only requirement from them is that any installation happens within pre-configured maintenance windows.
Software Update Deployment Process
Environment Pre-validation
To have a seamless patch deployment experience, even though we already monitor the infrastructure, we perform a few pre-validation tasks to make sure the infrastructure is healthy. Generally, we start these tasks 24 hours before Patch Tuesday so that we have some time to remediate any issues we identify.
- Because we have hundreds of servers, as a first step we make sure all servers are online and in a healthy condition. If we see any issues with servers, we take the necessary action.
- Validate the ConfigMgr site status (the status of all sites should be "Active") by checking in the console to make sure there are no issues with the sites.
Administration --> Overview --> Site Configuration --> Sites (then check the status for all servers)
- Validate all roles and components to make sure they are healthy and performing normally. There should not be any abnormal resource usage on the servers.
- Cross-check for any ongoing major deployments in the hierarchy. As a best practice, we prefer to avoid any other major deployments during this time.
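The first two checks above can be expressed as a small Python sketch. It is an assumption-laden illustration: the `servers` and `sites` dictionaries stand in for whatever inventory or monitoring data you have, and the function does not call any ConfigMgr API.

```python
def prevalidation_issues(servers, sites):
    """Return a list of human-readable issues found before Patch Tuesday.

    servers: dict mapping a server name to True (online) or False (offline).
    sites:   dict mapping a site code to its status string.
    Both inputs are hypothetical shapes used only for this sketch.
    """
    issues = []
    for name, online in sorted(servers.items()):
        if not online:
            issues.append(f"server {name} is offline")
    for code, status in sorted(sites.items()):
        if status != "Active":
            issues.append(f"site {code} status is {status}")
    return issues
```

An empty list means these two checks passed. Role health, component health, and ongoing deployments would still need to be reviewed separately.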
Monthly Software Update deployment process:
When the Automatic Deployment Rule runs on the second Tuesday of every month, it automatically creates the update group and the deployment, and updates the package with the newly published security updates. Once the security updates are published, we use internal automation to create all three different deployments. During this process, our automation validates last month's deployment settings to make sure the reboot setting is disabled. If last month's deployment is still set to active (reboot enforced), the automation suppresses the reboot to avoid multiple reboots.
To release the deployment more quickly, we use the setting that downloads the updates from Microsoft Update if the software update content is not available on a distribution point (Figure 2 below). By doing this, we release the software update deployment within ~3 hours of the security updates being released. When we set the deployment to show notifications, they are displayed on a periodic basis until all pending mandatory software updates are installed. By default, notifications display every 4 hours for deadlines more than 24 hours away, every hour for deadlines less than 24 hours away, and every 15 minutes for deadlines less than 1 hour away, and the restart countdown is 120 minutes.
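As a rough illustration of the schedule and the reboot check described above, the following Python sketch computes the second Tuesday of a month and clears an enforced reboot on the previous month's deployment. The `reboot_enforced` key and both function names are hypothetical; our internal automation is not shown here.

```python
from datetime import date, timedelta


def patch_tuesday(year, month):
    """Return the second Tuesday of the given month."""
    first = date(year, month, 1)
    # Days from the 1st to the first Tuesday (Tuesday is weekday 1).
    offset = (1 - first.weekday()) % 7
    return first + timedelta(days=offset + 7)


def suppress_previous_reboot(deployment):
    """Return a copy of last month's deployment with the reboot suppressed.

    `deployment` is a plain dict with a hypothetical `reboot_enforced` key.
    """
    if deployment.get("reboot_enforced"):
        return {**deployment, "reboot_enforced": False}
    return deployment
```

Returning a copy rather than modifying the input makes it easy to log what changed before applying it.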
Figure 2: Software Update deployment setting
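The default notification cadence described above can be written as a small Python function. This is just a sketch of the rule as stated in this post. The behavior at exactly 24 hours and exactly 1 hour is an assumption, and the function name is hypothetical rather than a client setting.

```python
from datetime import timedelta


def reminder_interval(time_to_deadline):
    """Return how often a reminder is shown for a given time to deadline.

    More than 24 hours away: every 4 hours.
    Between 1 and 24 hours away: every hour.
    1 hour or less away: every 15 minutes (boundary handling is assumed).
    """
    if time_to_deadline > timedelta(hours=24):
        return timedelta(hours=4)
    if time_to_deadline > timedelta(hours=1):
        return timedelta(hours=1)
    return timedelta(minutes=15)
```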
Silent patching is our primary way to deploy updates to machines. In this phase, we deploy updates silently in the background without enforcing system reboots, to assure a good user experience on full service machines, and we set the deadline two hours after the deployment time to expedite the update installation. After the update installation is complete, the machine goes into a "pending reboot" state, and we rely completely on natural reboots, which can be initiated by users or executed according to maintenance window settings. With this approach, around 70% of the machines are updated and rebooted.
After deploying updates silently for six days, we convert the deployment to interactive so that users get notifications for pending patches. In this phase, we convert the existing deployment to display all notifications and enforce the reboot with a deadline of two days (our normal reboot deadline is the third Tuesday at 11:59 PM client local time). To expedite patch deployments, we also go beyond ConfigMgr and approve updates in Windows Server Update Services (WSUS) so that machines can install the latest updates from either ConfigMgr or WSUS.
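The two phases above form a simple timeline, which the Python sketch below lays out. It assumes the deployment is released on Patch Tuesday (the second Tuesday), reads "six days" as exactly six 24-hour days, and uses hypothetical key names. It illustrates the schedule and is not our automation.

```python
from datetime import datetime, time, timedelta


def monthly_timeline(deployed_at):
    """Return key timestamps for a deployment released on Patch Tuesday.

    deployed_at: datetime of the silent deployment (client local time).
    """
    # Third Tuesday is one week after Patch Tuesday.
    third_tuesday = (deployed_at + timedelta(days=7)).date()
    return {
        # Silent phase: install deadline two hours after deployment.
        "silent_install_deadline": deployed_at + timedelta(hours=2),
        # Interactive phase starts after six days of silent patching.
        "interactive_start": deployed_at + timedelta(days=6),
        # Enforced reboot deadline: third Tuesday at 11:59 PM.
        "reboot_deadline": datetime.combine(third_tuesday, time(23, 59)),
    }
```

For a deployment released at 1:00 PM on March 14, 2017, this gives a silent install deadline of 3:00 PM the same day, an interactive start on March 20, and a reboot deadline of March 21 at 11:59 PM.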
Out of Band Release/update process
It is occasionally necessary to get a software update out more quickly, for example in response to a zero-day exploit or an active virus in the network. This happens rarely, but we have a process in place for these situations. To deliver software updates faster for any active exploit, we deploy updates set to display all notifications along with the installation deadline.
In this case, our goal is to deploy the updates as soon as possible to secure our network. To deploy the updates quickly, we use the interactive method so that users get notifications for pending patches. In this phase, we display all notifications and enforce the reboot with a deadline of ~24 hours (our normal reboot deadline is the next day at 6:00 PM client local time). To expedite patch deployments, we also go beyond ConfigMgr and approve updates in Windows Server Update Services (WSUS) on the second day at 7:00 PM PST so that machines can install updates from either ConfigMgr or WSUS.
During this process, desktops start to download the updates as soon as they receive the policies and show notifications to users about software update availability. Users can take action based on their availability. Notifications are presented on a periodic basis until all pending mandatory software update installations have completed. By default, they display every 4 hours for deadlines more than 24 hours away, every hour for deadlines less than 24 hours away, and every 15 minutes for deadlines less than 1 hour away, and the restart countdown is 120 minutes. This can place a huge burden on the ConfigMgr server infrastructure, so planning infrastructure capacity is key to meeting compliance in zero-day situations.
To ensure we have a high rate of compliance, we take some additional steps to see how effective our strategy has been and to quickly identify and address any failures or issues. To meet the compliance goal, we closely monitor the software update deployment and start remediation activities 24 hours after the software update deployment. During the remediation activity, our first focus is to remediate any issues on machines that are active but not performing WSUS scans. This gives us quick insight into how our software update efforts are going and alerts us if additional actions are necessary.
The graph below shows the trend of our patch compliance.
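The first remediation focus, machines that are active but not scanning, can be sketched as a simple filter in Python. The record fields (`name`, `last_seen`, `last_scan`) and the 7-day and 3-day thresholds are assumptions made for this example and are not values from our environment.

```python
from datetime import datetime, timedelta


def needs_scan_remediation(machines, now,
                           active_within=timedelta(days=7),
                           scan_within=timedelta(days=3)):
    """Return names of machines that are active but have no recent scan.

    machines: list of dicts with hypothetical keys `name`, `last_seen`
    (datetime) and `last_scan` (datetime or None).
    """
    flagged = []
    for machine in machines:
        active = now - machine["last_seen"] <= active_within
        last_scan = machine["last_scan"]
        scanned = last_scan is not None and now - last_scan <= scan_within
        if active and not scanned:
            flagged.append(machine["name"])
    return flagged
```

Machines that have not been seen recently are deliberately left out, because they need a different kind of follow-up than a scan repair.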
We hope this blog has helped you understand how we manage and exceed security update compliance on an ongoing basis using System Center Configuration Manager. Our success in achieving that key balance of security and end-user experience is fundamentally built on our first week of silent updates. Stay tuned for the next blog on how we measure SLA and on the different Power BI dashboards for our software update compliance, including proactive alerting and monitoring on key dependencies.
Disclaimer: This may not be explicitly supported and this blog post is for informational purposes. This post is provided “AS IS” with no warranties and confers no rights.