
Deploy Azure Monitor With Terraform

Managing cloud infrastructure and applications can quickly become complex. As your Azure environment grows, it’s crucial to maintain visibility into the health, performance, and security of your resources. Many third-party solutions can help with this, but Azure also ships a built-in one. Azure Monitor simplifies this challenge by collecting, analyzing, and acting on telemetry from your resources, whether they’re VMs, databases, containers, or web applications.

Think of Azure Monitor as your eyes and ears in Azure. It captures important metrics and logs, making them easily accessible for analysis. With customizable alerts, you can proactively address issues before they impact your users. By automating monitoring through Terraform, you can quickly replicate a consistent monitoring setup across your environments.

In this article, we’ll explore how to deploy Azure Monitor using Terraform.

You can also check out a few more articles in my Terraform series:

  1. Getting Started with Terraform on Azure
  2. Transition from ARM Templates to Terraform with AI
  3. Terraform Configuration Essentials: File Types, State Management, and Provider Selection
  4. Writing Your First Azure Terraform Configuration
  5. Modules in Terraform
  6. Deploy Azure Monitor with Terraform (You are here)
  7. Advanced Terraform Techniques and Best Practices (TBD)
  8. Integrating Terraform with Azure DevOps (TBD)
  9. Terraform Associate Certification Study Guide and Tips (TBD)

Let’s dive in!

Core Components of Azure Monitor

| Component | Description |
| --- | --- |
| Log Analytics Workspace | A centralized place to store and analyze logs and metrics from Azure and on-premises resources. |
| Data Collection Rule | Defines what data (e.g., event logs, syslogs, metrics) to collect from Azure resources and how to route them. |
| Action Group | Defines automated actions or notifications (e.g., emails, webhook integrations, automation runbooks) when alerts are triggered. |
| Application Insights | Monitors application performance, availability, and usage, providing powerful analytics for application troubleshooting. |
| Azure Policy | Enforces compliance and governance by setting predefined rules and configurations across Azure resources. There are many Azure Policies that can help with enforcing or automatically enabling Azure Monitor for different resource types. |

Resource Group

To make this example fully functional, we will start by deploying a Resource Group.

# Creates a resource group named "rg-monitoring" in the West US region.
resource "azurerm_resource_group" "monitoring" {
  name     = "rg-monitoring"
  location = "westus"
}
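
The resource group above assumes the azurerm provider is already configured. If it is not, a minimal sketch could look like the following. The version constraint and the `subscription_id` variable name are assumptions for illustration; pin to whatever provider version you have actually tested.

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0" # assumption: adjust to the version you use
    }
  }
}

# Hypothetical variable holding the target subscription ID.
variable "subscription_id" {
  type        = string
  description = "ID of the Azure subscription to deploy into."
}

provider "azurerm" {
  features {}
  subscription_id = var.subscription_id
}
```

The value can be supplied with `-var`, a `.tfvars` file, or the `TF_VAR_subscription_id` environment variable.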

Log Analytics Workspace

The core storage for telemetry and log data with powerful analytical querying capabilities.

# Creates a Log Analytics workspace with a retention period of 30 days.
resource "azurerm_log_analytics_workspace" "workspace" {
  name                = "monitoring-log-analytics"
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name
  sku                 = "PerGB2018"
  retention_in_days   = 30
}

When setting up Azure Monitor, it’s important to be mindful of the data you’re collecting. Log Analytics Workspace charges are based on the volume of data ingested, so indiscriminately collecting logs or metrics can quickly become costly. To manage expenses effectively, carefully plan your data collection strategy: only collect data that’s relevant for operational insights or necessary for your alerting scenarios. Regularly review your data collection rules, adjusting them as needed to ensure you’re capturing valuable insights without unnecessary overhead. By doing this, you keep your monitoring both effective and cost-efficient.
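
One way to put a hard ceiling on ingestion cost is the workspace’s daily quota. The sketch below is a variation of the workspace block above (use it instead of that block, not in addition to it), and the 5 GB figure is an arbitrary assumption rather than a recommendation.

```hcl
# Variant of the workspace above with a daily ingestion cap.
resource "azurerm_log_analytics_workspace" "workspace" {
  name                = "monitoring-log-analytics"
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name
  sku                 = "PerGB2018"
  retention_in_days   = 30

  # Stop ingesting billable data once this many GB arrive in a day.
  # Assumption: 5 GB is only an illustrative value.
  daily_quota_gb = 5
}
```

Keep in mind that once the cap is reached, new data is dropped until the next day, so alerts that depend on that data may stay silent.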

Data Collection Rules

Specifies exactly which types of telemetry data (such as syslogs and event logs) are collected from Azure and non-Azure sources.

# Defines a rule for collecting event logs and routing them to Log Analytics.
resource "azurerm_monitor_data_collection_rule" "event_collection" {
  name                = "syslog-event-collection"
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name

  destinations {
    log_analytics {
      workspace_resource_id = azurerm_log_analytics_workspace.workspace.id
      name                  = "workspace"
    }
  }

  data_flow {
    streams      = ["Microsoft-Event"]
    destinations = ["workspace"]
  }
}
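
The rule above defines a destination and a data flow, but it does not declare a data source, so it does not say which events to pick up. A minimal sketch of a rule that also declares a Windows event log source might look like this. The resource label, rule name, and XPath filters are assumptions for illustration; they select critical, error, and warning events from the System and Application logs.

```hcl
resource "azurerm_monitor_data_collection_rule" "windows_events" {
  name                = "windows-event-collection" # hypothetical name
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name

  destinations {
    log_analytics {
      workspace_resource_id = azurerm_log_analytics_workspace.workspace.id
      name                  = "workspace"
    }
  }

  # What to collect: Level 1-3 events from two Windows event logs.
  data_sources {
    windows_event_log {
      name    = "system-and-application-events"
      streams = ["Microsoft-Event"]
      x_path_queries = [
        "System!*[System[(Level=1 or Level=2 or Level=3)]]",
        "Application!*[System[(Level=1 or Level=2 or Level=3)]]",
      ]
    }
  }

  # Where to send it: the destination named "workspace" above.
  data_flow {
    streams      = ["Microsoft-Event"]
    destinations = ["workspace"]
  }
}
```

A rule only takes effect on machines it is associated with, using `azurerm_monitor_data_collection_rule_association`.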

Application Insights

Gives you insight into your application’s behavior, usage patterns, and performance by tracking requests, exceptions, and dependencies.

# Sets up workspace-based Application Insights to monitor web application performance.
resource "azurerm_application_insights" "app_insights" {
  name                = "app-insights"
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name
  application_type    = "web"
  # Store telemetry in the Log Analytics workspace defined earlier.
  workspace_id        = azurerm_log_analytics_workspace.workspace.id
}
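
Applications need the connection string to send telemetry to this resource. A small sketch of exposing it as an output follows; the output name is a hypothetical choice.

```hcl
# Expose the connection string so an app or pipeline can consume it.
output "app_insights_connection_string" {
  value     = azurerm_application_insights.app_insights.connection_string
  sensitive = true # keeps the value out of plain CLI output
}
```

After `terraform apply`, the value can be read with `terraform output -raw app_insights_connection_string`.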

Action Group

Defines automated actions or integrations triggered by Azure Monitor alerts, such as invoking Azure Automation runbooks, webhook integrations with third-party applications, or sending notifications.

# Creates an action group that sends an email and calls a webhook when an alert fires.
resource "azurerm_monitor_action_group" "complex_action_group" {
  name                = "monitoring-action-group"
  resource_group_name = azurerm_resource_group.monitoring.name
  short_name          = "monag"

  email_receiver {
    name          = "EmailReceiver"
    email_address = "your-email@example.com"
  }

  webhook_receiver {
    name        = "WebhookIntegration"
    service_uri = "https://your-webhook-endpoint.example.com"
  }
}
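
The email address and webhook URI above are hard-coded placeholders. One possible way to keep them out of the configuration is to pass them in as variables, sketched below. The variable names and the `notify_ops` resource label are hypothetical, and enabling the common alert schema is an optional choice, not something the original setup requires.

```hcl
variable "alert_email" {
  type        = string
  description = "Mailbox that receives alert notifications."
}

variable "alert_webhook_uri" {
  type        = string
  description = "HTTPS endpoint called when an alert fires."
  sensitive   = true
}

resource "azurerm_monitor_action_group" "notify_ops" {
  name                = "monitoring-action-group-vars" # hypothetical name
  resource_group_name = azurerm_resource_group.monitoring.name
  short_name          = "monagvar" # short names are limited to 12 characters

  email_receiver {
    name                    = "EmailReceiver"
    email_address           = var.alert_email
    use_common_alert_schema = true
  }

  webhook_receiver {
    name                    = "WebhookIntegration"
    service_uri             = var.alert_webhook_uri
    use_common_alert_schema = true
  }
}
```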

Alerts

With the basic setup complete and the desired data being collected, it is time to create some automated alerts.

Azure Health and Planned Maintenance Alerts

First, let’s set up notifications about outages, planned maintenance, and service advisories in our selected Azure region, so that we can proactively mitigate service disruptions.

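# Alerts for Azure Service Health events affecting West US or global services.
resource "azurerm_monitor_activity_log_alert" "service_health_alert" {
  name                = "AzureServiceHealthAlert"
  resource_group_name = azurerm_resource_group.monitoring.name
  location            = "global" # required by azurerm 4.x
  scopes              = ["/subscriptions/<subscription-id>"]

  criteria {
    category = "ServiceHealth"
    service_health {
      # Region display names as they appear in Service Health events.
      locations = ["West US", "Global"]
    }
  }

  # Without an action block the alert fires but nobody is notified.
  action {
    action_group_id = azurerm_monitor_action_group.complex_action_group.id
  }
}

# Alerts for planned maintenance in West US and globally. Planned maintenance
# is a Service Health event type, not a separate activity log category.
resource "azurerm_monitor_activity_log_alert" "planned_maintenance_alert" {
  name                = "PlannedMaintenanceAlert"
  resource_group_name = azurerm_resource_group.monitoring.name
  location            = "global" # required by azurerm 4.x
  scopes              = ["/subscriptions/<subscription-id>"]

  criteria {
    category = "ServiceHealth"
    service_health {
      events    = ["Maintenance"]
      locations = ["West US", "Global"]
    }
  }

  action {
    action_group_id = azurerm_monitor_action_group.complex_action_group.id
  }
}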
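
The `scopes` values above use a `/subscriptions/<subscription-id>` placeholder that has to be filled in by hand. As a sketch of one alternative, the current subscription can be looked up with a data source. The local name `subscription_scope` is a hypothetical choice.

```hcl
# Looks up the subscription the provider is authenticated against.
data "azurerm_subscription" "current" {}

locals {
  # Resolves to "/subscriptions/<guid>" for the current subscription.
  subscription_scope = data.azurerm_subscription.current.id
}
```

Any alert in this article could then use `scopes = [local.subscription_scope]` instead of the placeholder string.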

Resource Alerts and Data Processing Rules

| Alert Type | Description | Scope Examples |
| --- | --- | --- |
| Metric-based | Alerts triggered based on thresholds defined against performance metrics like CPU or memory. | Virtual Machines, Databases, AKS |
| Log-based | Alerts triggered based on conditions or patterns detected within collected log data. | Log Analytics Workspaces, VMs |
| Data Processing | Rules that control how telemetry data is collected, processed, and routed within Azure. | Subscription, Resource Groups |

Metric-based Alert Examples

This section covers the implementation of metric-based and log-based alerts, including a data processing rule to control alert noise.

VM Not Available

An alert that triggers when a VM becomes unavailable.

resource "azurerm_monitor_metric_alert" "vm_not_available" {
  name                = "VMNotAvailableAlert"
  resource_group_name = azurerm_resource_group.monitoring.name
  scopes              = ["/subscriptions/<subscription-id>"]
  severity            = 1
  window_size         = "PT5M"

  # Required when the scope is a subscription or a resource group.
  target_resource_type     = "Microsoft.Compute/virtualMachines"
  target_resource_location = azurerm_resource_group.monitoring.location

  criteria {
    metric_namespace = "Microsoft.Compute/virtualMachines"
    # "Heartbeat" is a Log Analytics signal, not a VM platform metric.
    # VmAvailabilityMetric reports 1 while the VM is available, 0 otherwise.
    metric_name = "VmAvailabilityMetric"
    aggregation = "Average"
    operator    = "LessThan"
    threshold   = 1
  }

  action {
    action_group_id = azurerm_monitor_action_group.complex_action_group.id
  }
}

High CPU Usage Alert (All VMs)

Triggers when CPU usage exceeds 90% for over 5 minutes across all VMs.

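resource "azurerm_monitor_metric_alert" "high_cpu_all_vms" {
  name                = "HighCPUAllVMsAlert"
  resource_group_name = azurerm_resource_group.monitoring.name
  scopes              = ["/subscriptions/<subscription-id>"]
  severity            = 2
  window_size         = "PT5M" # evaluate the last 5 minutes
  frequency           = "PT1M" # re-evaluate every minute

  # Required when the scope is a subscription or a resource group.
  target_resource_type     = "Microsoft.Compute/virtualMachines"
  target_resource_location = azurerm_resource_group.monitoring.location

  criteria {
    metric_namespace = "Microsoft.Compute/virtualMachines"
    metric_name      = "Percentage CPU"
    aggregation      = "Average"
    operator         = "GreaterThan"
    threshold        = 90
  }

  action {
    action_group_id = azurerm_monitor_action_group.complex_action_group.id
  }
}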

Data Processing Rule to Suppress CPU Alerts for Specific VM

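This example targets a non-production VM (VM-NonProdApp). Keep in mind that a data collection rule only controls which telemetry is collected and where it is routed. On its own it does not stop the High CPU metric alert above from firing, because that alert evaluates platform metrics. The rule below defines how CPU counters from that VM are collected, and the association attaches the rule to the VM.

resource "azurerm_monitor_data_collection_rule" "suppress_cpu_alert_vm" {
  name                = "SuppressCPUAlertNonProdVM"
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name

  # The destination referenced by data_flow must be declared in the rule.
  destinations {
    log_analytics {
      workspace_resource_id = azurerm_log_analytics_workspace.workspace.id
      name                  = "workspace"
    }
  }

  data_flow {
    streams      = ["Microsoft-Perf"]
    destinations = ["workspace"]
  }

  # Collect total CPU time once per minute.
  data_sources {
    performance_counter {
      name                          = "cpuMetrics"
      streams                       = ["Microsoft-Perf"]
      sampling_frequency_in_seconds = 60
      counter_specifiers            = ["\\Processor(_Total)\\% Processor Time"]
    }
  }
}

# Attaches the rule to the non-production VM.
resource "azurerm_monitor_data_collection_rule_association" "suppress_cpu_alert_vm_association" {
  name                    = "SuppressCPUAlertVM-NonProdApp"
  target_resource_id      = "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Compute/virtualMachines/VM-NonProdApp"
  data_collection_rule_id = azurerm_monitor_data_collection_rule.suppress_cpu_alert_vm.id
}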
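
To actually silence the High CPU alert for that VM, Azure Monitor offers alert processing rules, which are a different feature from data collection rules. The following is a minimal sketch, not part of the original setup. The resource label and rule name are hypothetical, and the VM ID reuses the same placeholder path as above.

```hcl
resource "azurerm_monitor_alert_processing_rule_suppression" "suppress_cpu_nonprod" {
  name                = "suppress-high-cpu-nonprod" # hypothetical name
  resource_group_name = azurerm_resource_group.monitoring.name

  # Only alerts fired on this VM are affected.
  scopes = [
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Compute/virtualMachines/VM-NonProdApp",
  ]

  # Only suppress notifications coming from the High CPU alert rule.
  condition {
    alert_rule_name {
      operator = "Equals"
      values   = [azurerm_monitor_metric_alert.high_cpu_all_vms.name]
    }
  }
}
```

With this in place the alert still fires and is recorded, but its action groups are not invoked for the scoped VM. An optional `schedule` block can limit the suppression to a time window.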

Log-based Alert Examples

Alert if Important Application Process Stops

An alert triggered when a critical process named “YourImportantProcessName” stops.

resource "azurerm_monitor_scheduled_query_rules_alert" "important_process_stopped" {
  name                = "ImportantProcessStoppedAlert"
  resource_group_name = azurerm_resource_group.monitoring.name
  location            = azurerm_resource_group.monitoring.location
  data_source_id      = azurerm_log_analytics_workspace.workspace.id
  severity            = 1
  frequency           = 5 # minutes between evaluations
  time_window         = 5 # minutes of data each evaluation looks at

  # Quotes need no escaping inside a heredoc.
  # 7034 = service terminated unexpectedly, 7036 = service changed state.
  query = <<QUERY
Event
| where EventLog == "System"
| where EventID == 7034 or (EventID == 7036 and RenderedDescription has "stopped")
| where RenderedDescription has "YourImportantProcessName"
QUERY

  trigger {
    operator  = "GreaterThan"
    threshold = 0
  }

  action {
    action_group = [azurerm_monitor_action_group.complex_action_group.id]
  }
}

Multiple Failed Login Attempts

This log-based alert monitors security logs for signs of suspicious activity, such as multiple failed login attempts within a short timeframe on your virtual machines. It’s particularly useful for identifying brute-force attacks or unauthorized access attempts.

resource "azurerm_monitor_scheduled_query_rules_alert" "multiple_failed_logins" {
  name                = "MultipleFailedLoginAttempts"
  resource_group_name = azurerm_resource_group.monitoring.name
  location            = azurerm_resource_group.monitoring.location
  data_source_id      = azurerm_log_analytics_workspace.workspace.id
  severity            = 1
  frequency           = 5
  time_window         = 5

  # "<<-" lets the query and its closing marker be indented.
  query = <<-QUERY
    SecurityEvent
    | where EventID == 4625  // Windows failed logins
    | summarize FailedAttempts = count() by Computer, bin(TimeGenerated, 5m)
    | where FailedAttempts > 10
  QUERY

  trigger {
    operator  = "GreaterThan"
    threshold = 0
  }

  action {
    action_group = [azurerm_monitor_action_group.complex_action_group.id]
  }
}
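
The azurerm provider also has a newer resource type for log alerts, `azurerm_monitor_scheduled_query_rules_alert_v2`. As a rough sketch of how the failed-login alert might be expressed with it (the resource label and alert name are hypothetical, and the thresholds simply mirror the example above):

```hcl
resource "azurerm_monitor_scheduled_query_rules_alert_v2" "failed_logins_v2" {
  name                = "MultipleFailedLoginAttemptsV2" # hypothetical name
  resource_group_name = azurerm_resource_group.monitoring.name
  location            = azurerm_resource_group.monitoring.location
  scopes              = [azurerm_log_analytics_workspace.workspace.id]
  severity            = 1

  # ISO 8601 durations replace the plain minute counts of the older resource.
  evaluation_frequency = "PT5M"
  window_duration      = "PT5M"

  criteria {
    query = <<-QUERY
      SecurityEvent
      | where EventID == 4625
      | summarize FailedAttempts = count() by Computer
      | where FailedAttempts > 10
    QUERY

    # Fire when the query returns at least one row.
    time_aggregation_method = "Count"
    operator                = "GreaterThan"
    threshold               = 0
  }

  action {
    action_groups = [azurerm_monitor_action_group.complex_action_group.id]
  }
}
```

Because the window is already five minutes, the query groups by `Computer` only and leaves the time binning to `window_duration`.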

With these foundational steps, you’ve configured basic Azure Monitor capabilities. In the next article in this series, we’ll explore more advanced alerts and monitoring strategies.

Azure Spring Clean 2025

This article is part of the Azure Spring Clean initiative, a community-driven event focused on sharing knowledge and best practices for Azure. Check out Azure Spring Clean for more insightful content from the Azure community.

Thanks for following along! Keep exploring, stay curious, and keep clouding around! 🚀☁️

Vukasin Terzic

Updated Mar 7, 2025
This post is licensed under CC BY 4.0