Azure Regions and Availability Zones sometimes experience capacity constraints, most noticeably with specialized or larger VM SKUs. The issue frequently arises with SKUs equipped with dedicated graphics cards (GPUs) or optimized for intensive workloads such as SAP HANA, but it can also appear unexpectedly for the standard D series and other VM series.
Capacity Availability vs. Subscription Quotas
Let’s first understand the difference between the two limitation scenarios:
Subscription Quota Limitations: Azure imposes quota limits on your subscription to manage overall resource utilization. For VMs, these limits are counted in vCPUs per VM family and per Region, so they cap how many VMs of a certain type you can run concurrently. If you hit these limits, Azure provides a mechanism to request a quota increase, either from the Azure Portal or via an Azure support ticket. Such requests are usually reviewed and approved quickly.
Actual Physical Capacity Availability: Even if your subscription quota is adequate, there can be instances where the actual physical hardware supporting your desired VM SKU is fully utilized in your selected Azure Region or Zone. In these scenarios, your provisioning attempts will fail, not due to quota limits but because the physical infrastructure has reached its maximum capacity.
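To rule out the quota scenario before suspecting physical capacity, you can compare current vCPU usage against the limit for the VM family. The following is a minimal sketch, not a definitive implementation: `Test-QuotaHeadroom` is a hypothetical helper name, it assumes usage objects shaped like the output of `Get-AzVMUsage` (with `Name.Value`, `CurrentValue`, and `Limit`), and the family name and numbers in the sample are illustrative only.

```powershell
# Hypothetical helper: checks whether a VM family still has enough free vCPU quota.
# $Usage is expected to look like the output of Get-AzVMUsage (assumption).
function Test-QuotaHeadroom {
    param (
        [Parameter(Mandatory=$true)]
        [object[]]$Usage,
        [Parameter(Mandatory=$true)]
        [string]$Family,
        [Parameter(Mandatory=$true)]
        [int]$RequiredVCpus
    )

    # Find the usage entry for the requested VM family
    $entry = $Usage | Where-Object { $_.Name.Value -eq $Family } | Select-Object -First 1
    if ($null -eq $entry) {
        Write-Warning "No usage entry found for family '$Family'."
        return
    }

    $free = $entry.Limit - $entry.CurrentValue
    [PSCustomObject]@{
        Family = $Family
        Used   = $entry.CurrentValue
        Limit  = $entry.Limit
        Free   = $free
        Fits   = ($free -ge $RequiredVCpus)
    }
}

# Offline sample with made-up numbers (illustrative only)
$sampleUsage = @(
    [PSCustomObject]@{
        Name         = [PSCustomObject]@{ Value = 'standardEDSv5Family' }
        CurrentValue = 48
        Limit        = 64
    }
)
Test-QuotaHeadroom -Usage $sampleUsage -Family 'standardEDSv5Family' -RequiredVCpus 16

# Live usage (requires the Az.Compute module and a signed-in session):
# $usage = Get-AzVMUsage -Location 'eastus2'
# Test-QuotaHeadroom -Usage $usage -Family 'standardEDSv5Family' -RequiredVCpus 16
```

If `Fits` is true and provisioning still fails, the cause is more likely the physical capacity scenario described above than a quota limit.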
For a detailed exploration of Azure Zones, please refer to my previous article Understanding Physical and Logical Azure Availability Zones.
Verifying SKU Availability in Specific Zones
Not all VM SKUs are universally available across every Zone within a Region. Some SKUs may not be available in your Azure Region at all, or might be limited to specific zones due to hardware distribution or operational considerations. So let’s start with verifying the availability of specific SKUs in your region and Zone. I wrote this helpful script that can do that for you:
```powershell
function Get-VMSKUAvailabilityInZones {
    param (
        [Parameter(Mandatory=$true)]
        [string]$Region,

        [Parameter(Mandatory=$true)]
        [string]$SKU,

        [Parameter(Mandatory=$false)]
        [string[]]$Zones
    )

    # Get the SKU entry for the specified region (VM SKUs only)
    $skuAvailability = Get-AzComputeResourceSku -Location $Region | Where-Object {
        $_.ResourceType -eq 'virtualMachines' -and
        $_.Locations -contains $Region -and
        $_.Name -eq $SKU
    }

    if ($null -eq $skuAvailability) {
        Write-Warning "SKU $SKU is not available in region $Region."
        return
    }

    # Zones in which the SKU is offered in this region
    $offeredZones = $skuAvailability.LocationInfo |
        Where-Object { $_.Location -eq $Region } |
        ForEach-Object { $_.Zones } |
        Sort-Object -Unique

    # If Zones parameter is not provided, use all zones that offer the SKU
    if (-not $Zones) {
        $Zones = $offeredZones
    }

    # Check availability in specified or all zones and emit one object per zone
    foreach ($zone in $Zones) {
        [PSCustomObject]@{
            SKU                = $SKU
            Region             = $Region
            Zone               = $zone
            SKUAvailableInZone = ($offeredZones -contains $zone)
        }
    }
}

# Example usage
#Get-VMSKUAvailabilityInZones -Region "eastus2" -SKU "Standard_D2s_v3" -Zones @("1", "2", "3")
Get-VMSKUAvailabilityInZones -Region "eastus2" -SKU "Standard_E16ds_v5"
```
The output of this script will look like this:
SKU | Region | Zone | SKUAvailableInZone |
---|---|---|---|
Standard_E16ds_v5 | eastus2 | 1 | True |
Standard_E16ds_v5 | eastus2 | 2 | True |
Standard_E16ds_v5 | eastus2 | 3 | True |
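
A SKU that is listed for a zone can still be restricted for a particular subscription. As a minimal sketch, assuming the SKU objects returned by `Get-AzComputeResourceSku` expose a `Restrictions` collection with `Type`, `ReasonCode`, and `RestrictionInfo` properties, the hypothetical helper `Get-SkuRestrictionSummary` below flattens those restrictions into a table. The sample object is made up for illustration.

```powershell
# Hypothetical helper: lists the restrictions attached to a SKU object.
# Assumes the object shape returned by Get-AzComputeResourceSku.
function Get-SkuRestrictionSummary {
    param (
        [Parameter(Mandatory=$true)]
        [object]$SkuInfo
    )

    foreach ($restriction in $SkuInfo.Restrictions) {
        [PSCustomObject]@{
            SKU        = $SkuInfo.Name
            Type       = "$($restriction.Type)"        # Location or Zone
            ReasonCode = "$($restriction.ReasonCode)"  # e.g. NotAvailableForSubscription
            Zones      = (@($restriction.RestrictionInfo.Zones) -join ',')
            Locations  = (@($restriction.RestrictionInfo.Locations) -join ',')
        }
    }
}

# Offline sample with a made-up restriction (illustrative only)
$sampleSku = [PSCustomObject]@{
    Name         = 'Standard_E16ds_v5'
    Restrictions = @(
        [PSCustomObject]@{
            Type            = 'Zone'
            ReasonCode      = 'NotAvailableForSubscription'
            RestrictionInfo = [PSCustomObject]@{
                Zones     = @('1', '3')
                Locations = @('eastus2')
            }
        }
    )
}
Get-SkuRestrictionSummary -SkuInfo $sampleSku

# Live usage (requires the Az.Compute module and a signed-in session):
# Get-AzComputeResourceSku -Location 'eastus2' |
#     Where-Object { $_.Name -eq 'Standard_E16ds_v5' } |
#     ForEach-Object { Get-SkuRestrictionSummary -SkuInfo $_ }
```

An empty result means no restrictions are recorded for the SKU; it says nothing about whether hardware is currently free.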
Despite a VM SKU appearing available in Azure documentation or initial script queries, you may still encounter actual capacity constraints. A quick, manual method to verify real-time availability is attempting to provision the VM directly from the Azure Portal. However, this is not practical for automation scenarios, large-scale migrations, or Infrastructure as Code (IaC) deployments.
To automate this verification, you can use this script, which I designed explicitly for programmatic checks at scale:
```powershell
function Get-VmSKUCapacityAvailability {
    param (
        [Parameter(Mandatory=$true)]
        [string]$Region,

        [Parameter(Mandatory=$true)]
        [string]$SKU,

        [Parameter(Mandatory=$false)]
        [string[]]$Zones
    )

    # If Zones parameter is not provided, assume three zones in the region
    if (-not $Zones) {
        $Zones = @("1", "2", "3") # Edit this if Region has different number of Zones
    }

    # Prepare placeholder credentials (nothing is created because of -WhatIf)
    $VMLocalAdminUser = "WhatIfUser"
    $VMLocalAdminSecurePassword = ConvertTo-SecureString -String "WhatIfPassword" -AsPlainText -Force
    $Credential = New-Object System.Management.Automation.PSCredential ($VMLocalAdminUser, $VMLocalAdminSecurePassword)

    # Check VM creation in specified or all zones
    foreach ($zone in $Zones) {
        $vmParams = @{
            ResourceGroupName   = "WhatIfResourceGroup"
            Location            = $Region
            Size                = $SKU
            Name                = "WhatIfVM"
            Zone                = $zone
            ImageName           = "Win2022AzureEdition"
            VirtualNetworkName  = "WhatIfVNet"
            SubnetName          = "WhatIfSubnet"
            SecurityGroupName   = "WhatIfNSG"
            PublicIpAddressName = "WhatIfPublicIP"
            Credential          = $Credential
            WhatIf              = $true   # Dry run only: no resources are created, no hardware is allocated
            ErrorAction         = 'Stop'  # Turn errors into exceptions so the catch block can see them
        }

        try {
            $null = New-AzVM @vmParams
            $isAvailable = $true
        } catch {
            $isAvailable = $false
        }

        # Emit one result object per zone
        [PSCustomObject]@{
            SKU                = $SKU
            Region             = $Region
            Zone               = $zone
            SKUAvailableInZone = $isAvailable
        }
    }
}

# Example usage
#Get-VmSKUCapacityAvailability -Region "eastus2" -SKU "Standard_D2s_v3" -Zones @("1", "2", "3")
Get-VmSKUCapacityAvailability -Region "eastus2" -SKU "Standard_E16ds_v5"
```
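This script checks whether a VM creation request is accepted in the selected (or all) Zones and returns a result like the one below. Because `-WhatIf` never allocates hardware, a `True` here is not a guarantee: a real deployment can still fail with an allocation error.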
SKU | Region | Zone | SKUAvailableInZone |
---|---|---|---|
Standard_E16ds_v5 | eastus2 | 1 | True |
Standard_E16ds_v5 | eastus2 | 2 | True |
Standard_E16ds_v5 | eastus2 | 3 | True |
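
When a real deployment does fail, the error text usually tells you which of the two scenarios you hit. The sketch below is a hypothetical helper, `Get-DeploymentFailureKind`, that classifies an error message by looking for well-known Azure error codes. The code lists are my assumption of the most common ones and are not exhaustive, and the sample messages are shortened illustrations rather than real log output.

```powershell
# Hypothetical helper: classifies a deployment error as Capacity, Quota, or Other.
# The code lists below are not exhaustive (assumption).
function Get-DeploymentFailureKind {
    param (
        [Parameter(Mandatory=$true)]
        [string]$ErrorText
    )

    # Allocation failures and SKU unavailability point to physical capacity or restrictions
    $capacityCodes = @(
        'ZonalAllocationFailed',
        'AllocationFailed',
        'OverconstrainedZonalAllocationRequest',
        'OverconstrainedAllocationRequest',
        'SkuNotAvailable'
    )
    foreach ($code in $capacityCodes) {
        if ($ErrorText -match "\b$code\b") { return 'Capacity' }
    }

    # Quota errors mention the quota explicitly
    if ($ErrorText -match 'QuotaExceeded|exceeding approved .* quota') {
        return 'Quota'
    }

    return 'Other'
}

# Offline samples (shortened, illustrative messages)
Get-DeploymentFailureKind -ErrorText 'ErrorCode: ZonalAllocationFailed'
Get-DeploymentFailureKind -ErrorText 'Operation could not be completed as it results in exceeding approved standardEDSv5Family Cores quota.'
Get-DeploymentFailureKind -ErrorText 'ErrorCode: InvalidParameter'
```

A `Capacity` result calls for the options described in the next section, while a `Quota` result is resolved with a quota increase request.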
Handling VM SKU Unavailability
When facing VM SKU unavailability due to physical capacity exhaustion, you have limited yet critical options:
- Wait and Retry: Capacity availability constantly changes as other customers provision or deallocate resources. Re-attempting after a period may yield better results.
- Engage Microsoft Support: Contacting your Microsoft representative or support can sometimes result in securing priority or additional capacity allocation for critical workloads.
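
The wait-and-retry option above can be automated with a simple backoff loop. This is a minimal sketch under stated assumptions: `Invoke-WithRetry` is a hypothetical helper, the delays are arbitrary defaults, and the action you pass in is expected to throw when the deployment fails (for example by calling `New-AzVM` with `-ErrorAction Stop`).

```powershell
# Hypothetical helper: runs an action and retries it with exponential backoff.
function Invoke-WithRetry {
    param (
        [Parameter(Mandatory=$true)]
        [scriptblock]$Action,
        [int]$MaxAttempts = 5,
        [int]$InitialDelaySeconds = 60,
        [int]$MaxDelaySeconds = 900
    )

    $delay = $InitialDelaySeconds
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            # Return the action's output on the first successful attempt
            return (& $Action)
        } catch {
            # Give up and rethrow the last error once all attempts are used
            if ($attempt -eq $MaxAttempts) { throw }

            Write-Warning "Attempt $attempt of $MaxAttempts failed: $($_.Exception.Message) Retrying in $delay seconds."
            Start-Sleep -Seconds $delay

            # Double the wait time, capped at the maximum delay
            $delay = [Math]::Min($delay * 2, $MaxDelaySeconds)
        }
    }
}

# Example usage (commented out because it needs a signed-in Az session;
# $vmParams is a placeholder for your own New-AzVM parameters):
# Invoke-WithRetry -MaxAttempts 6 -Action { New-AzVM @vmParams -ErrorAction Stop }
```

Retrying blindly on every error wastes time, so in practice you may want to retry only when the failure is an allocation error and fail fast on everything else.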
Proactive Measures and Best Practices
To effectively navigate capacity challenges, consider the following proactive strategies:
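Proximity Placement Groups (PPGs): PPGs can sometimes help when you combine a scarce VM SKU with other VMs. Deploying the scarcest SKU first anchors the group to a datacenter that actually hosts that hardware, and the remaining VMs are then placed alongside it. Keep in mind that a PPG is itself an additional placement constraint, so it can also cause allocation failures if the datacenter it is pinned to runs short of a required SKU.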
Azure Reservations: For frequently utilized or mission-critical VMs with limited availability, reserving capacity is highly recommended. On-demand Capacity Reservations hold compute capacity for a specific VM size in a Region or Zone for your subscription, whereas Reserved VM Instances are primarily a billing discount and do not by themselves guarantee capacity. Without reserved capacity, temporary deallocation, for example during Disaster Recovery (DR) scenarios, could result in losing your previously allocated resources to another customer, leaving you unable to start your VMs again.
Flexibility in VM SKU Selection: Whenever feasible, build flexibility into your infrastructure strategy by identifying multiple VM SKUs that can fulfill your workload requirements. Being adaptable in your SKU choices helps you mitigate the risk of a shortage in any one SKU. This can be difficult when you need solution-certified VM SKUs, but even then there are usually at least a few different certified options available.
Cross-Region or Cross-Zone Redundancy: Distributing workloads across multiple regions or zones can significantly reduce the impact of local capacity limitations. Employing strategies like regional redundancy or multi-zone architectures enhances both capacity availability and disaster recovery capabilities.
Monitoring and Alerts: Set up monitoring alerts using Azure Monitor to proactively identify trends or unexpected spikes in resource usage, which can help in predicting potential capacity constraints before they become critical.
Understand Azure Announcements and Updates: Regularly review Azure updates and regional announcements regarding new capacity additions or SKU retirements. Staying informed allows better preparation for future constraints and optimal planning.
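The SKU-flexibility and cross-zone strategies above can be combined into a simple fallback search. The following is a rough sketch, not a definitive implementation: `Select-FirstAvailableSku` is a hypothetical helper that walks an ordered list of acceptable SKUs and zones and returns the first combination that an availability check accepts. The check itself is passed in as a script block, and the sample catalog is made up for illustration.

```powershell
# Hypothetical helper: returns the first SKU/zone pair accepted by the availability check.
function Select-FirstAvailableSku {
    param (
        [Parameter(Mandatory=$true)]
        [string[]]$PreferredSkus,   # Ordered from most to least preferred
        [Parameter(Mandatory=$true)]
        [string[]]$Zones,
        [Parameter(Mandatory=$true)]
        [scriptblock]$IsAvailable   # Called with ($sku, $zone); must return $true or $false
    )

    foreach ($sku in $PreferredSkus) {
        foreach ($zone in $Zones) {
            if (& $IsAvailable $sku $zone) {
                return [PSCustomObject]@{ SKU = $sku; Zone = $zone }
            }
        }
    }
    # Nothing matched: the function returns no output
}

# Offline sample: a made-up availability catalog (illustrative only)
$catalog = @{
    'Standard_E16ds_v5' = @('2')
    'Standard_E16s_v5'  = @('1', '2', '3')
}
$check = { param($sku, $zone) $catalog[$sku] -contains $zone }

Select-FirstAvailableSku -PreferredSkus 'Standard_E16ds_v5', 'Standard_E16s_v5' `
                         -Zones '1', '2', '3' `
                         -IsAvailable $check
```

In a live environment the check could wrap a query against `Get-AzComputeResourceSku` or a test deployment; keeping it as a parameter lets you swap the verification method without touching the fallback logic.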
Effectively managing Azure capacity challenges involves understanding the nuances between subscription quotas and physical capacity limitations, proactively checking VM SKU availability, and adopting strategies like using Proximity Placement Groups, Azure Reservations, and flexible SKU choices. By combining these best practices with vigilant monitoring and staying informed about Azure updates, you can significantly reduce the risk of disruptions, ensuring your workloads remain highly available, reliable, and optimized in the cloud.
I hope this was useful. Thanks for reading, and keep clouding around!
Vukasin Terzic