Managing cloud infrastructure and applications can quickly become complex. As your Azure environment grows, it’s crucial to maintain visibility into the health, performance, and security of your resources. Many third-party solutions can help with this, but Azure also ships with built-in tooling. Azure Monitor simplifies this challenge by collecting, analyzing, and acting on telemetry from your resources, whether they’re VMs, databases, containers, or web applications.
Think of Azure Monitor as your eyes and ears in Azure. It captures important metrics and logs, making them easily accessible for analysis. With customizable alerts, you can proactively address issues before they impact your users. By automating monitoring through Terraform, you can quickly replicate a consistent monitoring setup across your environments.
In this article, we’ll explore how to deploy Azure Monitor using Terraform.
You can also check out a few more articles in my Terraform series:
- Getting Started with Terraform on Azure
- Transition from ARM Templates to Terraform with AI
- Terraform Configuration Essentials: File Types, State Management, and Provider Selection
- Writing Your First Azure Terraform Configuration
- Modules in Terraform
- Deploy Azure Monitor with Terraform (You are here)
- Advanced Terraform Techniques and Best Practices (TBD)
- Integrating Terraform with Azure DevOps (TBD)
- Terraform Associate Certification Study Guide and Tips (TBD)
Let’s dive in!
Core Components of Azure Monitor
Component | Description |
---|---|
Log Analytics Workspace | A centralized place to store and analyze logs and metrics from Azure and on-premises resources. |
Data Collection Rule | Defines what data (e.g., event logs, syslogs, metrics) to collect from Azure resources and how to route them. |
Action Group | Defines automated actions or notifications (e.g., emails, webhook integrations, automation runbooks) when alerts are triggered. |
Application Insights | Monitors application performance, availability, and usage, providing powerful analytics for application troubleshooting. |
Azure Policy | Enforces compliance and governance by setting predefined rules and configurations across Azure resources. There are many Azure Policies that can help with enforcing or automatically enabling Azure Monitor for different resource types. |
Resource Group
To make this example fully functional, we will start by deploying a Resource Group.
# Creates a resource group named "rg-monitoring" in the West US region.
resource "azurerm_resource_group" "monitoring" {
  name     = "rg-monitoring"
  location = "westus"
}
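The resource group above also needs a provider configuration before `terraform init` and `terraform plan` will run. The following is a minimal sketch; the version constraints are assumptions for illustration, not values from this article, so pin whatever versions you have actually tested.

```hcl
terraform {
  # Assumption: illustrative constraint, adjust to your installed Terraform version.
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0" # assumption: illustrative constraint
    }
  }
}

provider "azurerm" {
  # The features block is mandatory, even when it is empty.
  features {}

  # With azurerm 4.x, the subscription must be supplied either here
  # (subscription_id = "...") or through the ARM_SUBSCRIPTION_ID environment variable.
}
```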
Log Analytics Workspace
The core storage for telemetry and log data with powerful analytical querying capabilities.
# Creates a Log Analytics workspace with a retention period of 30 days.
resource "azurerm_log_analytics_workspace" "workspace" {
  name                = "monitoring-log-analytics"
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name
  sku                 = "PerGB2018"
  retention_in_days   = 30
}
When setting up Azure Monitor, it’s important to be mindful of the data you’re collecting. Log Analytics Workspace charges are based on the volume of data ingested, so indiscriminately collecting logs or metrics can quickly become costly. To manage expenses effectively, carefully plan your data collection strategy: only collect data that’s relevant for operational insights or necessary for your alerting scenarios. Regularly review your data collection rules, adjusting them as needed to ensure you’re capturing valuable insights without unnecessary overhead. By doing this, you keep your monitoring effective without unnecessary cost.
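One way to put a hard ceiling on ingestion cost is the workspace's daily cap. The sketch below is a hypothetical variant of the workspace definition, not part of the original setup; the resource label, the name, and the 5 GB value are illustrative assumptions.

```hcl
# Hypothetical variant of the workspace with a daily ingestion cap.
resource "azurerm_log_analytics_workspace" "workspace_capped" {
  name                = "monitoring-log-analytics-capped" # hypothetical name
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name
  sku                 = "PerGB2018"
  retention_in_days   = 30

  # Stop ingesting once roughly 5 GB has been collected in a day (illustrative value).
  daily_quota_gb = 5
}
```

Keep in mind that once the cap is reached, collection pauses until the daily reset, so log-based alerts can miss events during that window. A cap is a safety net rather than a replacement for a deliberate data collection strategy.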
Data Collection Rules
Specifies exactly which types of telemetry data (such as syslogs and event logs) are collected from Azure and non-Azure sources.
# Defines a rule for collecting event logs and routing them to Log Analytics.
resource "azurerm_monitor_data_collection_rule" "event_collection" {
  name                = "syslog-event-collection"
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name

  destinations {
    log_analytics {
      workspace_resource_id = azurerm_log_analytics_workspace.workspace.id
      name                  = "workspace"
    }
  }

  data_flow {
    streams      = ["Microsoft-Event"]
    destinations = ["workspace"]
  }
}
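A data collection rule only gathers data once it declares a data source and is associated with a machine. The following sketch shows one way this might look for Windows event logs; the resource labels, the rule name, the XPath filter, and the `vm_id` variable are all assumptions made for illustration.

```hcl
# Hypothetical input: the resource ID of the VM to collect from.
variable "vm_id" {
  description = "Resource ID of the virtual machine to associate with the rule."
  type        = string
}

resource "azurerm_monitor_data_collection_rule" "windows_events" {
  name                = "windows-event-collection" # hypothetical name
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name

  destinations {
    log_analytics {
      workspace_resource_id = azurerm_log_analytics_workspace.workspace.id
      name                  = "workspace"
    }
  }

  data_sources {
    windows_event_log {
      name    = "system-events"
      streams = ["Microsoft-Event"]
      # Illustrative filter: critical, error, and warning events from the System log.
      x_path_queries = ["System!*[System[(Level=1 or Level=2 or Level=3)]]"]
    }
  }

  data_flow {
    streams      = ["Microsoft-Event"]
    destinations = ["workspace"]
  }
}

# Links the rule to the VM so the agent on that machine picks it up.
resource "azurerm_monitor_data_collection_rule_association" "windows_events" {
  name                    = "windows-events-association"
  target_resource_id      = var.vm_id
  data_collection_rule_id = azurerm_monitor_data_collection_rule.windows_events.id
}
```

Narrowing the XPath query is also the most direct lever for the ingestion costs discussed earlier, since only matching events are sent to the workspace.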
Application Insights
Gives you insight into your application’s behavior, usage patterns, and performance by tracking requests, exceptions, and dependencies.
# Sets up Application Insights to monitor web application performance.
resource "azurerm_application_insights" "app_insights" {
  name                = "app-insights"
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name
  application_type    = "web"
}
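If you want application telemetry stored alongside the rest of your logs, Application Insights can be backed by the Log Analytics workspace created earlier. This is a hedged sketch of that variant; the resource label, the name, and the output name are assumptions for illustration.

```hcl
# Hypothetical workspace-based variant of the Application Insights resource.
resource "azurerm_application_insights" "app_insights_workspace" {
  name                = "app-insights-workspace-based" # hypothetical name
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name
  application_type    = "web"

  # Sends telemetry to the shared Log Analytics workspace.
  workspace_id = azurerm_log_analytics_workspace.workspace.id
}

# Exposes the connection string so an application deployment can consume it.
output "app_insights_connection_string" {
  value     = azurerm_application_insights.app_insights_workspace.connection_string
  sensitive = true
}
```

Marking the output as sensitive keeps the connection string out of the plan and apply console output.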
Action Group
Defines automated actions or integrations triggered by Azure Monitor alerts, such as invoking Azure Automation runbooks, webhook integrations with third-party applications, or sending notifications.
# Creates an action group that sends an email and calls a webhook when an alert fires.
resource "azurerm_monitor_action_group" "complex_action_group" {
  name                = "monitoring-action-group"
  resource_group_name = azurerm_resource_group.monitoring.name
  short_name          = "monag"

  email_receiver {
    name          = "EmailReceiver"
    email_address = "your-email@example.com"
  }

  webhook_receiver {
    name        = "WebhookIntegration"
    service_uri = "https://your-webhook-endpoint.example.com"
  }
}
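When several people need the same notifications, the receivers can be generated from a list instead of being repeated by hand. The sketch below assumes a hypothetical `alert_emails` variable and a second, separately named action group; neither is part of the original configuration.

```hcl
# Hypothetical input: everyone who should receive alert emails.
variable "alert_emails" {
  description = "Email addresses to notify when an alert fires."
  type        = list(string)
  default     = ["your-email@example.com"]
}

resource "azurerm_monitor_action_group" "team_action_group" {
  name                = "monitoring-team-action-group" # hypothetical name
  resource_group_name = azurerm_resource_group.monitoring.name
  short_name          = "monteam" # short names are limited to 12 characters

  # Emits one email_receiver block per address in the list.
  dynamic "email_receiver" {
    for_each = var.alert_emails

    content {
      name          = "email-${email_receiver.key}" # key is the list index
      email_address = email_receiver.value
    }
  }
}
```

Adding or removing a recipient then becomes a one-line change to the variable value rather than an edit to the resource.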
Alerts
Now that the basic setup is complete and we are collecting the data we need, it is time to create some automated alerts.
Azure Health and Planned Maintenance Alerts
First, let’s set up notifications about outages and service advisories in our selected Azure region, along with planned maintenance, so we can proactively mitigate service disruptions.
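# Alerts for Azure Service Health events affecting West US or global services.
resource "azurerm_monitor_activity_log_alert" "service_health_alert" {
  name                = "AzureServiceHealthAlert"
  resource_group_name = azurerm_resource_group.monitoring.name
  location            = "global" # activity log alerts are global; recent azurerm versions require this argument
  scopes              = ["/subscriptions/<subscription-id>"]

  criteria {
    category = "ServiceHealth"

    service_health {
      # Service Health matches on region display names.
      locations = ["West US", "Global"]
    }
  }

  # Without an action block the alert fires but notifies no one.
  action {
    action_group_id = azurerm_monitor_action_group.complex_action_group.id
  }
}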
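# Sets up alerts specifically for planned maintenance notifications in West US and globally.
resource "azurerm_monitor_activity_log_alert" "planned_maintenance_alert" {
  name                = "PlannedMaintenanceAlert"
  resource_group_name = azurerm_resource_group.monitoring.name
  location            = "global" # activity log alerts are global; recent azurerm versions require this argument
  scopes              = ["/subscriptions/<subscription-id>"]

  criteria {
    # Planned maintenance is a Service Health event type, not its own category.
    category = "ServiceHealth"

    service_health {
      events    = ["Maintenance"]
      locations = ["West US", "Global"]
    }
  }

  action {
    action_group_id = azurerm_monitor_action_group.complex_action_group.id
  }
}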
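The `/subscriptions/<subscription-id>` placeholder used in `scopes` can be resolved at plan time instead of being hard-coded. A minimal sketch, assuming the alerts should target the subscription the provider is already authenticated against (the local value name is hypothetical):

```hcl
# Looks up the subscription that the azurerm provider is currently using.
data "azurerm_subscription" "current" {}

locals {
  # The id attribute has the form "/subscriptions/<guid>", which is what scopes expects.
  subscription_scope = [data.azurerm_subscription.current.id]
}
```

Each alert could then use `scopes = local.subscription_scope`, which keeps the same configuration portable across subscriptions and environments.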
Resource Alerts and Data Processing Rules
Alert Type | Description | Scope Examples |
---|---|---|
Metric-based | Alerts triggered based on thresholds defined against performance metrics like CPU or memory. | Virtual Machines, Databases, AKS |
Log-based | Alerts triggered based on conditions or patterns detected within collected log data. | Log Analytics Workspaces, VMs |
Data Processing | Rules that control how telemetry data is collected, processed, and routed within Azure. | Subscription, Resource Groups |
Metric-based Alert Examples
This section covers the implementation of metric-based and log-based alerts, including a data processing rule to control alert noise.
VM Not Available
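An alert that triggers if a VM is unavailable. Heartbeat is a Log Analytics table rather than a VM platform metric, so this example uses the VM availability metric, which reports 1 while the VM is available and 0 when it is not.
# Fires when a VM in the subscription reports as unavailable.
resource "azurerm_monitor_metric_alert" "vm_not_available" {
  name                = "VMNotAvailableAlert"
  resource_group_name = azurerm_resource_group.monitoring.name
  scopes              = ["/subscriptions/<subscription-id>"]
  severity            = 1
  window_size         = "PT5M"

  # Both arguments are required when the scope is a subscription.
  target_resource_type     = "Microsoft.Compute/virtualMachines"
  target_resource_location = azurerm_resource_group.monitoring.location

  criteria {
    metric_namespace = "Microsoft.Compute/virtualMachines"
    metric_name      = "VmAvailabilityMetric"
    aggregation      = "Average"
    operator         = "LessThan"
    threshold        = 1
  }

  action {
    action_group_id = azurerm_monitor_action_group.complex_action_group.id
  }
}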
High CPU Usage Alert (All VMs)
Triggers when a VM’s average CPU usage exceeds 90% over a 5-minute window, evaluated every minute for all VMs in scope.
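# Fires when average CPU stays above 90% over a 5-minute window.
resource "azurerm_monitor_metric_alert" "high_cpu_all_vms" {
  name                = "HighCPUAllVMsAlert"
  resource_group_name = azurerm_resource_group.monitoring.name
  scopes              = ["/subscriptions/<subscription-id>"]
  severity            = 2
  window_size         = "PT5M"
  frequency           = "PT1M"

  # Both arguments are required when the scope is a subscription.
  target_resource_type     = "Microsoft.Compute/virtualMachines"
  target_resource_location = azurerm_resource_group.monitoring.location

  criteria {
    metric_namespace = "Microsoft.Compute/virtualMachines"
    metric_name      = "Percentage CPU"
    aggregation      = "Average"
    operator         = "GreaterThan"
    threshold        = 90
  }

  action {
    action_group_id = azurerm_monitor_action_group.complex_action_group.id
  }
}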
Data Processing Rule to Suppress CPU Alerts for Specific VM
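The goal is to keep High CPU alerts for a non-production VM (VM-NonProdApp) from adding noise. A data collection rule on its own only controls which telemetry is collected and where it is sent; it does not mute a metric alert. The rule here collects the CPU counter from that VM into the workspace, where it can be analyzed separately.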
# Collects the total CPU performance counter and routes it to the Log Analytics workspace.
resource "azurerm_monitor_data_collection_rule" "suppress_cpu_alert_vm" {
  name                = "SuppressCPUAlertNonProdVM"
  location            = azurerm_resource_group.monitoring.location
  resource_group_name = azurerm_resource_group.monitoring.name

  # data_flow refers to this destination by name, so it must be declared.
  destinations {
    log_analytics {
      workspace_resource_id = azurerm_log_analytics_workspace.workspace.id
      name                  = "workspace"
    }
  }

  data_sources {
    performance_counter {
      name                          = "cpuMetrics"
      streams                       = ["Microsoft-Perf"]
      sampling_frequency_in_seconds = 60
      counter_specifiers            = ["\\Processor(_Total)\\% Processor Time"]
    }
  }

  data_flow {
    streams      = ["Microsoft-Perf"]
    destinations = ["workspace"]
  }
}

# Associates the rule with the VM-NonProdApp virtual machine.
resource "azurerm_monitor_data_collection_rule_association" "suppress_cpu_alert_vm_association" {
  name                    = "SuppressCPUAlertVM-NonProdApp"
  target_resource_id      = "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Compute/virtualMachines/VM-NonProdApp"
  data_collection_rule_id = azurerm_monitor_data_collection_rule.suppress_cpu_alert_vm.id
}
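To actually silence notifications for that VM, Azure Monitor offers alert processing rules, which the provider exposes as a separate resource. The following is a hedged sketch rather than the article's original approach: the resource label and rule name are assumptions, and the condition matches on the alert rule name `HighCPUAllVMsAlert` defined earlier.

```hcl
# Hypothetical suppression rule: mutes notifications from the High CPU alert
# when the affected resource is the non-production VM.
resource "azurerm_monitor_alert_processing_rule_suppression" "suppress_cpu_nonprod" {
  name                = "suppress-high-cpu-nonprod" # hypothetical name
  resource_group_name = azurerm_resource_group.monitoring.name

  # Only alerts fired on this VM are affected.
  scopes = [
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Compute/virtualMachines/VM-NonProdApp"
  ]

  condition {
    alert_rule_name {
      operator = "Equals"
      values   = ["HighCPUAllVMsAlert"]
    }
  }
}
```

With no `schedule` block the suppression applies all the time. The alert itself is still evaluated and recorded; only its action groups are skipped, so the history remains available if you need to review it later.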
Log-based Alert Examples
Alert if Important Application Process Stops
An alert triggered when a critical process named “YourImportantProcessName” stops.
# Fires when the workspace receives a matching service event for the named process.
resource "azurerm_monitor_scheduled_query_rules_alert" "important_process_stopped" {
  name                = "ImportantProcessStoppedAlert"
  resource_group_name = azurerm_resource_group.monitoring.name
  location            = azurerm_resource_group.monitoring.location
  data_source_id      = azurerm_log_analytics_workspace.workspace.id
  severity            = 1
  frequency           = 5 # minutes
  time_window         = 5 # minutes

  # Heredocs do not process backslash escapes, so the quotes are written as-is.
  # Event 7036: service changed state; event 7034: service terminated unexpectedly.
  query = <<QUERY
Event
| where EventLog == "System" and (EventID == 7036 or EventID == 7034)
| where RenderedDescription has "YourImportantProcessName"
QUERY

  trigger {
    operator  = "GreaterThan"
    threshold = 0
  }

  action {
    action_group = [azurerm_monitor_action_group.complex_action_group.id]
  }
}
Multiple Failed Login Attempts
This log-based alert monitors security logs for signs of suspicious activity, such as multiple failed login attempts within a short timeframe on your virtual machines. It’s particularly useful for identifying brute-force attacks or unauthorized access attempts.
# Fires when a computer records more than 10 failed logins within a 5-minute bin.
resource "azurerm_monitor_scheduled_query_rules_alert" "multiple_failed_logins" {
  name                = "MultipleFailedLoginAttempts"
  resource_group_name = azurerm_resource_group.monitoring.name
  location            = azurerm_resource_group.monitoring.location
  data_source_id      = azurerm_log_analytics_workspace.workspace.id
  severity            = 1
  frequency           = 5 # minutes
  time_window         = 5 # minutes

  query = <<QUERY
SecurityEvent
| where EventID == 4625 // Windows failed logins
| summarize FailedAttempts = count() by Computer, bin(TimeGenerated, 5m)
| where FailedAttempts > 10
QUERY

  trigger {
    operator  = "GreaterThan"
    threshold = 0
  }

  action {
    action_group = [azurerm_monitor_action_group.complex_action_group.id]
  }
}
With these foundational steps, you’ve configured basic Azure Monitor capabilities. In the next article in this series, we will explore more advanced alerts and monitoring strategies.
Azure Spring Clean 2025
This article is part of the Azure Spring Clean initiative, a community-driven event focused on sharing knowledge and best practices for Azure. Check out Azure Spring Clean for more insightful content from the Azure community.
Thanks for following along! Keep exploring, stay curious, and keep clouding around! 🚀☁️
Vukasin Terzic