How to use and monitor Azure Update Management

Azure Spring Clean

Welcome to the first annual Azure Spring Clean. It is time for clearing away unused Azure resources, accounts and RBAC roles. Mowing the grass of reports and recommendations from Azure Advisor. Re-seeding your subscription with new or improved Azure Policies and Blueprints. Reviewing your billing and resource usage. Dusting off your dashboards, reports and existing documentation of your resources and subscriptions.

If you are not familiar with the Azure Spring Clean, this is a great initiative by Joe Carlyle and Thomas Thornton. Every week we get 5 new articles written by the community. For the full list of articles please visit the website AzureSpringClean.com. To find out more about the initiative and why spring clean before the actual spring, check out the episode of the Azure Late Show Podcast.

Today is day 14 and we are going to talk about Azure Update Management: how to configure it to deliver updates to your Windows and Linux VMs, on-premises and in the cloud, and how to monitor it.

What is Azure Update Management

Taking care of servers in the on-premises environment is something that we all do. But this is not always the case with VMs that we have in Azure. Provisioning any number of Linux and Windows Virtual Machines in Azure is easy. And because of that, and because of the OpEx payment model, those VMs usually don’t have a long life expectancy. But there are still workloads and legacy type applications that are not built to work that way and we have to run them in Azure VMs full time.

We have to maintain the systems we have, and part of that means that we have to patch them. No matter what kind of tools we use, patch management means the following:

  • Identify machines that we need to patch
  • Identify and approve the updates and features that we want to install
  • Set the maintenance window
  • Deploy the patches, reboot if needed
  • Check the results
  • Repeat on schedule

For Azure VMs that are part of our existing network, we can use the tools that we already have, such as WSUS or SCCM. But if that is a separate environment, or we want to manage things from the familiar interface of the Azure Portal, we can use a feature that is already part of our Azure subscription: Azure Update Management.

Azure Update Management allows us to assess and update Windows and Linux systems. From there we can schedule deployments and orchestrate the installation of updates within a predefined maintenance window. We can create server groups and approve specific updates for different sets of machines. The service can be used with Azure and non-Azure Virtual Machines, whether they run on-premises or in other public or private clouds.

To achieve all that, Azure Update Management uses Azure Automation and its components. Computer status is collected by the Microsoft Monitoring Agent (MMA). Updates are delivered to Windows systems through Windows Update and to Linux systems via PowerShell Desired State Configuration (DSC). An Automation Hybrid Runbook Worker is needed to execute Azure Automation runbooks that update on-premises machines. All logs and information about the updates are stored in Azure Log Analytics.

Azure Update Management can also be used in combination with System Center Configuration Manager. That opens up additional options, such as installing third-party updates.

The solution itself is available at no additional cost, but some charges still apply. We pay for the log data stored in Azure Log Analytics and an additional $6 for each non-Azure node. Outgoing traffic for delivering updates to our on-premises VMs generates some additional cost as well.
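Once a workspace has been collecting data for a while, a query along these lines can give a rough idea of how much billable log data is being ingested. This is only a sketch: it assumes the standard `Usage` table of a Log Analytics workspace, where `Quantity` is reported in megabytes, and the 30-day window is an arbitrary choice.

```kusto
// Sketch: billable data volume per data type over the last 30 days.
// Assumes the standard Usage table; Quantity is reported in MB.
Usage
| where TimeGenerated > ago(30d)
| where IsBillable == true
| summarize BillableGB = sum(Quantity) / 1000 by DataType
| sort by BillableGB desc
```

The `Update` and `Heartbeat` rows in the result are the ones most directly related to Update Management.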

Setup

Requirements

Before we can start creating and using it, we need to check some pre-requisites.

Automation Account & Log Analytics

First, we need an Automation Account and Azure Log Analytics. And they have to be in the right regions. Check out the workspace mappings docs page for the updated list of region mappings.

Supported operating systems are Windows Server 2008-2019 and different versions of Linux, including CentOS 6 and 7, Ubuntu 14.04 and later, SUSE 11 and 12, and Red Hat Enterprise Linux 6 and 7. If you are running a VM in Azure, chances are that it is supported. Older server OSes and client OSes are not supported.

Integration with SCCM will require installing additional management packs.

Your VMs also need to communicate with Microsoft Azure and Windows Update Management service. Communication occurs over port 443 and the following URLs need to be allowed:

  • *.ods.opinsights.azure.com
  • *.oms.opinsights.azure.com
  • *.blob.core.windows.net
  • *.azure-automation.net

Configuration

The solution needs to be enabled before we can start adding VMs. We can do that in the Azure Portal from the Virtual Machines page, from an individual VM page, or from the Azure Automation Account. We can also configure all pre-requisites and add Virtual Machines directly from an Azure-connected Windows Admin Center.

In this guide, I am going to show the configuration from the Automation Account.

If we open our Automation Account and click Update management, we can see all the pre-requisites from there.

Resources Automation Account & Log Analytics

Select your Log Analytics workspace and matching Automation Account, and click Enable. A couple of minutes later the solution is enabled and we can start adding the VMs.

Registering Azure VMs

Azure Update Management Main Page

From here we can start adding Virtual Machines.

Click on Add Azure VMs, select virtual machines that you want to add to the solution, and click Enable. Easy.

Add Azure VMs

It might take some time before machines start showing up.

Registering Non-Azure VMs

Registering non-Azure VMs requires installing the Microsoft Monitoring Agent and connecting it to your Log Analytics workspace.

Open Advanced settings under your Log Analytics workspace, and from there select Windows or Linux servers. There you can find the agent installation files, Workspace ID, and your Primary and Secondary Keys.

Download, install and register using the Workspace ID and keys.

Add Azure VMs

Now you just have to add your VMs to the solution, as you did before.
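To confirm that a newly connected machine is actually reporting, a quick check against the `Heartbeat` table can help. The sketch below reuses the `Solutions has "updates"` test from the queries later in this post. The one-hour window is an arbitrary choice, and `ComputerEnvironment` is assumed to be populated with values such as Azure or Non-Azure.

```kusto
// Sketch: which machines sent a heartbeat in the last hour,
// and is the Update Management solution enabled on them?
Heartbeat
| where TimeGenerated > ago(1h) and notempty(Computer)
| summarize arg_max(TimeGenerated, Solutions, OSType, ComputerEnvironment) by Computer
| extend UpdateManagementEnabled = Solutions has "updates"
// ComputerEnvironment is assumed to distinguish Azure from non-Azure machines
| project Computer, OSType, ComputerEnvironment, UpdateManagementEnabled, LastHeartbeat = TimeGenerated
```

A machine that is missing from the result has not sent a heartbeat in the selected window, which usually points to an agent or connectivity problem.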

Scheduling the Update Deployment

After a package is released, it takes 2 to 3 hours for the patch to show up for Linux machines for assessment. For Windows machines, it takes 12 to 15 hours for the patch to show up for assessment after it’s been released.

A compliance scan runs every 12 hours by default, but you can force one by restarting the MMA on the VM.
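To see whether an assessment has arrived for each machine, something like the following can be run in the workspace. It is a minimal sketch that only looks at the `Update` table; the two-day lookback is an arbitrary assumption.

```kusto
// Sketch: when did each machine last report update assessment data?
Update
| where TimeGenerated > ago(2d)
| summarize LastAssessment = max(TimeGenerated) by Computer, OSType
| extend HoursSinceAssessment = datetime_diff("hour", now(), LastAssessment)
| sort by HoursSinceAssessment desc
```

Machines at the top of the list are the ones whose assessment data is the oldest.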

Once VMs are added and assessed, we can start deploying the missing updates. For that, we have to create and schedule an Update Deployment.

Name

Each Update Deployment should have a unique and descriptive name.

OS Type

We can choose if we want Windows or Linux, but they can’t be mixed.

Included VMs

Then we can decide if we want to select a group of VMs or specific VMs from the list. Groups can be dynamic, and they can be based on different parameters, such as Tags, Resource Groups, Region, VM Size, OS type, etc.

Update Classification

We can also select what update classifications we want to install, and if we want to include only specific updates or exclude some and install everything else.

Update Schedule

Updates can be scheduled to run once or on a recurring schedule. We can select the date and time, and how often and when to repeat it.

Pre-scripts + Post-scripts

This option lets us run PowerShell runbooks in our Azure Automation account before and after the update deployment. Those scripts are executed in the Azure context and not locally on the machines. This is great if we run multi-tier applications and want to make sure that workloads are stopped or started in the right order. I personally use it to properly shut down the applications and to copy some temporary data before the updates.

Maintenance window

This number represents the maximum length of time in which the updates must be completed. The last 20 minutes of the specified maintenance window are reserved for starting everything back up, so the machines can be up and running before the window expires.

Reboot options

Here we have 4 options to choose from: always, never, if needed, or only reboot without installing the updates.

Credentials

It is also important to understand under which context and credentials those updates are delivered. We can have VMs that are in different domains or not domain-joined at all. We need to make sure that the Automation Account can access those VMs, whether that is under the Azure Run As account or with different credentials or certificates. Those need to be added to Key Vault or registered under the Credentials section in the Automation Account.

Monitoring

Now that the Azure Update Management service is configured and our servers are being updated, we also need to make sure that the service itself is monitored. A few things can go wrong, and we need to be aware of them: the Microsoft Monitoring Agent is out of date or not responding, servers have stopped communicating with the Log Analytics workspace, or the maintenance window was too short and not all updates were installed in time. Here are a few ways to keep an eye on that.

View status in Azure Portal

We can always look at the Azure Portal to see how many updates are missing or if there are service issues. The History tab will show the results of previous update deployments.

Error in Portal

Alerts

The Azure Automation Account offers an alerting mechanism that can send information about each update deployment run, and also notify us if and when something goes wrong.

To configure an alert, go to Monitoring, Alerts and click New Alert Rule. Create an alert condition and a group of people to receive notifications.

I like to create alerts for when the Update Deployment fails, or when the maintenance window was too short to complete installation of all updates.

New Alert Rule
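For a log-based alert condition, a query against the `UpdateRunProgress` table is one possible starting point. Treat this as a sketch rather than a finished rule: the list of `InstallationStatus` values is an assumption and should be checked against the values that appear in your own workspace, and the one-day window is arbitrary.

```kusto
// Sketch: updates that failed or did not fit into the maintenance window.
// The status values below are assumptions; verify them in your workspace.
UpdateRunProgress
| where TimeGenerated > ago(1d)
| where InstallationStatus in ("Failed", "InstallFailed", "FailedToStart", "MaintenanceWindowExceeded")
| summarize AffectedUpdates = count() by Computer, UpdateRunName, InstallationStatus
| sort by AffectedUpdates desc
```

An alert rule could then fire whenever this query returns any rows.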

Logs

Azure Log Analytics gives us very powerful logging features. In addition to the information shown in Azure Update Management, we can search all the logs and pull out the information we need.

Search results like these can also be displayed on your Azure Dashboard, or parsed by a runbook in the Automation Account to create custom alerts and notifications.

To access logs, go to the Monitoring section and then Logs. From there you can use pre-defined search queries, or type your own.

Logs

Here are a couple of examples that I like to use:

Computer Summary

Heartbeat
| where TimeGenerated>ago(12h) and OSType=~"Windows" and notempty(Computer)
| summarize arg_max(TimeGenerated, Solutions) by SourceComputerId
| where Solutions has "updates"
| distinct SourceComputerId
| join kind=leftouter
(
    Update
    | where TimeGenerated>ago(14h) and OSType!="Linux"
    | summarize hint.strategy=partitioned arg_max(TimeGenerated, UpdateState, Approved, Optional, Classification) by SourceComputerId, UpdateID
    | distinct SourceComputerId, Classification, UpdateState, Approved, Optional
    | summarize WorstMissingUpdateSeverity=max(iff(UpdateState=~"Needed" and (Optional==false or Classification has "Critical" or Classification has "Security") and Approved!=false, iff(Classification has "Critical", 4, iff(Classification has "Security", 2, 1)), 0)) by SourceComputerId
)
on SourceComputerId
| extend WorstMissingUpdateSeverity=coalesce(WorstMissingUpdateSeverity, -1)
| summarize computersBySeverity=count() by WorstMissingUpdateSeverity
| union (Heartbeat
| where TimeGenerated>ago(12h) and OSType=="Linux" and notempty(Computer)
| summarize arg_max(TimeGenerated, Solutions) by SourceComputerId
| where Solutions has "updates"
| distinct SourceComputerId
| join kind=leftouter
(
    Update
    | where TimeGenerated>ago(5h) and OSType=="Linux"
    | summarize hint.strategy=partitioned arg_max(TimeGenerated, UpdateState, Classification) by SourceComputerId, Product, ProductArch
    | distinct SourceComputerId, Classification, UpdateState
    | summarize WorstMissingUpdateSeverity=max(iff(UpdateState=~"Needed", iff(Classification has "Critical", 4, iff(Classification has "Security", 2, 1)), 0)) by SourceComputerId
)
on SourceComputerId
| extend WorstMissingUpdateSeverity=coalesce(WorstMissingUpdateSeverity, -1)
| summarize computersBySeverity=count() by WorstMissingUpdateSeverity)
| summarize assessedComputersCount=sumif(computersBySeverity, WorstMissingUpdateSeverity>-1), notAssessedComputersCount=sumif(computersBySeverity, WorstMissingUpdateSeverity==-1), computersNeedCriticalUpdatesCount=sumif(computersBySeverity, WorstMissingUpdateSeverity==4), computersNeedSecurityUpdatesCount=sumif(computersBySeverity, WorstMissingUpdateSeverity==2), computersNeedOtherUpdatesCount=sumif(computersBySeverity, WorstMissingUpdateSeverity==1), upToDateComputersCount=sumif(computersBySeverity, WorstMissingUpdateSeverity==0)
| summarize assessedComputersCount=sum(assessedComputersCount), computersNeedCriticalUpdatesCount=sum(computersNeedCriticalUpdatesCount),  computersNeedSecurityUpdatesCount=sum(computersNeedSecurityUpdatesCount), computersNeedOtherUpdatesCount=sum(computersNeedOtherUpdatesCount), upToDateComputersCount=sum(upToDateComputersCount), notAssessedComputersCount=sum(notAssessedComputersCount)
| extend allComputersCount=assessedComputersCount+notAssessedComputersCount

Missing updates summary

Update
| where TimeGenerated>ago(5h) and OSType=="Linux" and SourceComputerId in ((Heartbeat
| where TimeGenerated>ago(12h) and OSType=="Linux" and notempty(Computer)
| summarize arg_max(TimeGenerated, Solutions) by SourceComputerId
| where Solutions has "updates"
| distinct SourceComputerId))
| summarize hint.strategy=partitioned arg_max(TimeGenerated, UpdateState, Classification) by Computer, SourceComputerId, Product, ProductArch
| where UpdateState=~"Needed"
| summarize by Product, ProductArch, Classification
| union (Update
| where TimeGenerated>ago(14h) and OSType!="Linux" and (Optional==false or Classification has "Critical" or Classification has "Security") and SourceComputerId in ((Heartbeat
| where TimeGenerated>ago(12h) and OSType=~"Windows" and notempty(Computer)
| summarize arg_max(TimeGenerated, Solutions) by SourceComputerId
| where Solutions has "updates"
| distinct SourceComputerId))
| summarize hint.strategy=partitioned arg_max(TimeGenerated, UpdateState, Classification, Approved) by Computer, SourceComputerId, UpdateID
| where UpdateState=~"Needed" and Approved!=false
| summarize by UpdateID, Classification )
| summarize allUpdatesCount=count(), criticalUpdatesCount=countif(Classification has "Critical"), securityUpdatesCount=countif(Classification has "Security"), otherUpdatesCount=countif(Classification !has "Critical" and Classification !has "Security")
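The summary above counts distinct missing updates across all machines. When the question is which machine is missing what, a per-computer breakdown can be easier to read. The following is a simplified sketch for Windows machines only, reusing the same filters on `UpdateState`, `Approved` and `Optional`. It skips the heartbeat check, so it may include machines that have since stopped reporting.

```kusto
// Sketch: missing updates per Windows machine, grouped by classification.
Update
| where TimeGenerated > ago(14h) and OSType != "Linux"
// keep only the latest record for each update on each machine
| summarize arg_max(TimeGenerated, UpdateState, Classification, Approved, Optional) by Computer, UpdateID
| where UpdateState =~ "Needed" and Approved != false
| where Optional == false or Classification has "Critical" or Classification has "Security"
| summarize MissingUpdates = count() by Computer, Classification
| sort by Computer asc, MissingUpdates desc
```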

Missing updates list

Update
| where TimeGenerated>ago(5h) and OSType=="Linux" and SourceComputerId in ((Heartbeat
| where TimeGenerated>ago(12h) and OSType=="Linux" and notempty(Computer)
| summarize arg_max(TimeGenerated, Solutions) by SourceComputerId
| where Solutions has "updates"
| distinct SourceComputerId))
| summarize hint.strategy=partitioned arg_max(TimeGenerated, UpdateState, Classification, BulletinUrl, BulletinID) by SourceComputerId, Product, ProductArch
| where UpdateState=~"Needed"
| project-away UpdateState, TimeGenerated
| summarize computersCount=dcount(SourceComputerId, 2), ClassificationWeight=max(iff(Classification has "Critical", 4, iff(Classification has "Security", 2, 1))) by id=strcat(Product, "_", ProductArch), displayName=Product, productArch=ProductArch, classification=Classification, InformationId=BulletinID, InformationUrl=tostring(split(BulletinUrl, ";", 0)[0]), osType=1
| union(Update
| where TimeGenerated>ago(14h) and OSType!="Linux" and (Optional==false or Classification has "Critical" or Classification has "Security") and SourceComputerId in ((Heartbeat
| where TimeGenerated>ago(12h) and OSType=~"Windows" and notempty(Computer)
| summarize arg_max(TimeGenerated, Solutions) by SourceComputerId
| where Solutions has "updates"
| distinct SourceComputerId))
| summarize hint.strategy=partitioned arg_max(TimeGenerated, UpdateState, Classification, Title, KBID, PublishedDate, Approved) by Computer, SourceComputerId, UpdateID
| where UpdateState=~"Needed" and Approved!=false
| project-away UpdateState, Approved, TimeGenerated
| summarize computersCount=dcount(SourceComputerId, 2), displayName=any(Title), publishedDate=min(PublishedDate), ClassificationWeight=max(iff(Classification has "Critical", 4, iff(Classification has "Security", 2, 1))) by id=strcat(UpdateID, "_", KBID), classification=Classification, InformationId=strcat("KB", KBID), InformationUrl=iff(isnotempty(KBID), strcat("https://support.microsoft.com/kb/", KBID), ""), osType=2)
| sort by ClassificationWeight desc, computersCount desc, displayName asc
| extend informationLink=(iff(isnotempty(InformationId) and isnotempty(InformationUrl), toobject(strcat('{ "uri": "', InformationUrl, '", "text": "', InformationId, '", "target": "blank" }')), toobject('')))
| project-away ClassificationWeight, InformationId, InformationUrl
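The queries above only consider machines that sent a heartbeat in the last 12 hours, so a server that has gone silent simply drops out of the results. To catch those, a separate check can help. This is a sketch under a few assumptions: the seven-day lookback and the one-hour threshold are arbitrary, and `Version` is taken to be the agent version reported in the `Heartbeat` table.

```kusto
// Sketch: machines that reported in the last 7 days but have been silent for over an hour.
Heartbeat
| where TimeGenerated > ago(7d) and notempty(Computer)
| summarize arg_max(TimeGenerated, Version, OSType) by Computer
| where TimeGenerated < ago(1h)
// Version is assumed to hold the monitoring agent version
| project Computer, OSType, AgentVersion = Version, LastHeartbeat = TimeGenerated
| sort by LastHeartbeat asc
```

Dropping the `where TimeGenerated < ago(1h)` line turns this into a simple inventory of agent versions, which is handy for spotting outdated agents.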

Summary

I hope you enjoyed this quick look at the Azure Update Management. I like how this solution fits nicely with cloud-only and hybrid environments and it is not trying to replace something that we already use on-premises.

Vukašin Terzić

Updated Mar 10, 2020 2020-03-10T22:20:10+01:00
This post is licensed under CC BY 4.0