ARM

ARM Template deployment bug in Azure Stack

I came across an interesting situation at a client when trying to deploy an ARM template that I have deployed successfully a few times in the past on both Azure and Azure Stack.  What the template deploys doesn't matter, but I hit a problem that you might encounter when deploying a template with parameters; more specifically, it comes down to how they're named.

I tried to deploy a template I modified back in January to a customer's Azure Stack stamp that was at the latest update version (1905) at the time of writing. 

The parameters looked like this:

[Screenshot: Parameters - Microsoft Azure Stack]

When I tried to do a custom deployment, I got the following:

[Screenshots: Deploy Solution Template - Microsoft Azure Stack; Errors - Microsoft Azure Stack]

I tried to deploy the same template to Azure and it worked, so I knew the template was OK.  I also tried on a 1902 system and it worked.  Testing on a 1903 system, I got the error above again, so whatever change is causing the problem was introduced with that update and persists in later versions.

After some trial and error, I found that doing a find/replace to rename the parameters, removing the '_' prefix from _artifactsLocation and _artifactsLocationSasToken in my templates, fixed the problem. It wasn’t so obvious from the error message what the issue was, one of the joys of working with ARM!

Hopefully this issue gets fixed as _artifactsLocation and _artifactsLocationSasToken are classed as standard parameters per https://github.com/Azure/azure-quickstart-templates/blob/master/1-CONTRIBUTION-GUIDE/best-practices.md
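The find/replace can be scripted rather than done by hand. Here is a minimal sketch, assuming the template is available as plain text. The sample template and its scriptUri variable are made up for illustration and are not the template I deployed.

```powershell
# Hypothetical template fragment using the underscore-prefixed names.
$template = @'
{
  "parameters": {
    "_artifactsLocation": { "type": "string" },
    "_artifactsLocationSasToken": { "type": "securestring" }
  },
  "variables": {
    "scriptUri": "[concat(parameters('_artifactsLocation'), '/deploy.ps1', parameters('_artifactsLocationSasToken'))]"
  }
}
'@

# Strip the leading underscore from both names. _artifactsLocationSasToken
# starts with _artifactsLocation, so one case-sensitive replace covers both.
# The lookbehind avoids touching longer identifiers that merely contain the name.
$fixed = $template -creplace '(?<![A-Za-z0-9])_artifactsLocation', 'artifactsLocation'

$fixed
```

For a real file, read it with Get-Content -Raw and write the result back with Set-Content. Remember that any parameters file or pipeline that passes values by the old names needs the same rename.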

Replace/Fix Unicode characters created by ConvertTo-Json in PowerShell for ARM Templates


Often there will be instances where you may want to make a programmatic change to an ARM template using a scripting language like PowerShell. Parsing XML or JSON type files in the past wasn’t always the easiest thing via some scripting languages. Those of us who remember the days of parsing XML files using VBScript still occasionally wake up in the middle of the night in a cold sweat.

Thankfully, PowerShell makes this relatively easy for even non-developers through the use of some native conversion cmdlets that allow you to quickly take a JSON file and convert it to and from a custom PowerShell object that you can easily traverse and modify. You can do this with the ConvertFrom-Json cmdlet like so:

[powershell]$myJSONobject = Get-Content -raw "c:\myJSONfile.txt" | ConvertFrom-Json[/powershell]

Without digging into all the different ways that you can use this to tweak an ARM template to your heart’s satisfaction, it’s reasonable to assume that at some point you’d like to convert your newly modified object back into a plain ole’ JSON file with the intention of submitting it back into Azure, and that’s when things go awry. When you use the intuitively named ConvertTo-Json cmdlet to convert your object back into a flat JSON text block, your gorgeous block of JSON content that once looked like this:

Now looks like this:

See the problem? Here it is again with some helpful highlighting:

All your special characters get replaced by Unicode escape sequences; an apostrophe, for example, becomes \u0027. While PowerShell’s ConvertFrom-Json cmdlet and other parsers handle these codes just fine when converting from text, Azure will have none of this nonsense and your template will fail to import/deploy/etc. Fortunately, the fix is very simple. The Regex class in the .NET Framework has an “Unescape” method:

https://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex(v=vs.110).aspx

So instead of using:

[powershell]$myOutput = $myJSONobject | ConvertTo-Json -Depth 50[/powershell]

Use:

[powershell]$myOutput = $myJSONobject | ConvertTo-Json -Depth 50 | % { [System.Text.RegularExpressions.Regex]::Unescape($_) }[/powershell]

The output from the second command should generate a perfectly working ARM template. That is, of course, assuming that you correctly built your ARM template in the first place, but I take no responsibility for that. ;)
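One caveat worth knowing: Regex.Unescape converts every backslash sequence it recognizes, not just the \uXXXX ones, so an escaped quote (\") or backslash (\\) inside a string value gets unescaped too and can leave you with invalid JSON. If your template contains such values, a more conservative option is to replace only the handful of escapes in question. The sketch below assumes the four escapes are \u0026, \u0027, \u003c and \u003e; the sample string is hand-written to mimic Windows PowerShell 5.1 output, not captured from a real template.

```powershell
# Hand-written sample mimicking ConvertTo-Json output from Windows PowerShell 5.1.
$escaped = '{"value":"[concat(parameters(\u0027name\u0027), \u0027-vm\u0027)]","note":"say \"hi\" \u0026 go"}'

# Map only the escapes in question back to their literal characters.
$map = @{
    '\u0026' = '&'
    '\u0027' = "'"
    '\u003c' = '<'
    '\u003e' = '>'
}

$clean = $escaped
foreach ($code in $map.Keys) {
    # String.Replace is a literal replace (no regex) and leaves \" and \\ alone.
    $clean = $clean.Replace($code, $map[$code])
}

$clean
```

The escaped quotes around "hi" survive, so the result still parses with ConvertFrom-Json.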

ExpressRoute Migration from ASM to ARM and legacy ASM Virtual Networks


I recently ran into an issue where an ExpressRoute had been migrated from Classic (ASM) to the new portal (ARM), but legacy Classic Virtual Networks (VNets) were still in operation. These VNets refused to be deleted by either portal or PowerShell. Disconnecting the old VNet’s Gateway through the Classic portal would show success, but it would stay connected.

There’s no option to disconnect an ASM gateway in the ARM portal, only a delete option. Gave this a shot and predictably, this was the result:

[Screenshot: failed gateway deletion (FailedDeleteGW.PNG)]

Ok, let’s go to PowerShell and look for that obstinate link. Running Get-AzureDedicatedCircuitLink resulted in the following error:

PS C:\> get-AzureDedicatedCircuitLink -ServiceKey $ServiceKey -VNetName $Vnet
get-AzureDedicatedCircuitLink : InternalError: The server encountered an internal error. Please retry the request.
At line:1 char:1
+ get-AzureDedicatedCircuitLink -ServiceKey xxxxxx-xxxx-xxxx-xxxx-xxx...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : CloseError: (:) [Get-AzureDedicatedCircuitLink], CloudException
    + FullyQualifiedErrorId : Microsoft.WindowsAzure.Commands.ExpressRoute.GetAzureDedicatedCircuitLinkCommand

I couldn’t even find the link. Not only was modifying the circuit an issue, but reads were failing, too.

Turned out to be a simple setting change. When the ExpressRoute was migrated, as there were still Classic VNets, a final step of enabling the circuit for both deployment models was needed. Take a look at the culprit setting here, after running Get-AzureRMExpressRouteCircuit:

"serviceProviderProperties": {
  "serviceProviderName": "equinix",
  "peeringLocation": "Silicon Valley",
  "bandwidthInMbps": 1000
},
"circuitProvisioningState": "Disabled",
"allowClassicOperations": false,
"gatewayManagerEtag": "",
"serviceKey": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"serviceProviderProvisioningState": "Provisioned"

AllowClassicOperations set to “false” blocks ASM operations from any access, including a simple “get” from the ExpressRoute circuit. Granting access is straightforward:

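# Get details of the ExpressRoute circuit
$ckt = Get-AzureRmExpressRouteCircuit -Name "DemoCkt" -ResourceGroupName "DemoRG"

# Set "Allow Classic Operations" to TRUE
$ckt.AllowClassicOperations = $true

# Commit the change; until this runs, only the local object has been modified
Set-AzureRmExpressRouteCircuit -ExpressRouteCircuit $ckt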

More info on this here.

But we still weren’t finished. I could now get a successful response from this:

get-AzureDedicatedCircuitLink -ServiceKey $ServiceKey -VNetName $Vnet

However this still failed:

Remove-AzureDedicatedCircuitLink -ServiceKey $ServiceKey -VNetName $Vnet

So reads worked, but no modify. Ah—I remembered the ARM portal lock feature, and sure enough, a Read-Only lock on the Resource Group was inherited by the ExpressRoute (more about those here). Once the lock was removed, voila, I could remove the stubborn VNets no problem.

# Remove the Circuit Link for the Vnet
Remove-AzureDedicatedCircuitLink -ServiceKey $ServiceKey -VNetName $Vnet

# Disconnect the gateway
Set-AzureVNetGateway -Disconnect -VNetName $Vnet -LocalNetworkSiteName <LocalNetworksitename>

# Delete the gateway
Remove-AzureVNetGateway -VNetName $Vnet

There’s still no command to remove a single VNet. You have to use the portal (either will work), or you can use PowerShell to edit the NetworkConfig.xml file, then import it.
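If you go the NetworkConfig.xml route, the edit itself is just an XML node removal. A rough sketch follows, assuming a trimmed-down configuration held in a string; the VNet names are hypothetical and a real file will also carry address spaces, subnets and gateway sections. In practice you would export the file first with Get-AzureVNetConfig -ExportToFile and push it back with Set-AzureVNetConfig -ConfigurationPath.

```powershell
# Trimmed-down, hypothetical network configuration with two classic VNets.
[xml]$config = @'
<NetworkConfiguration xmlns="http://schemas.microsoft.com/ServiceHosting/2011/07/NetworkConfiguration">
  <VirtualNetworkConfiguration>
    <VirtualNetworkSites>
      <VirtualNetworkSite name="LegacyVNet1" Location="West US" />
      <VirtualNetworkSite name="KeepVNet" Location="West US" />
    </VirtualNetworkSites>
  </VirtualNetworkConfiguration>
</NetworkConfiguration>
'@

$Vnet = 'LegacyVNet1'

# Locate the site by its name attribute. GetAttribute is used instead of
# .name to avoid any confusion with the node's own Name property.
$sites = $config.NetworkConfiguration.VirtualNetworkConfiguration.VirtualNetworkSites
$target = $sites.GetElementsByTagName('VirtualNetworkSite') |
    Where-Object { $_.GetAttribute('name') -eq $Vnet } |
    Select-Object -First 1

# Detach the matching site from its parent, if it was found.
if ($target) {
    [void]$sites.RemoveChild($target)
}

$config.OuterXml
```

Call $config.Save() with a full path to write the edited file before importing it. Importing replaces the whole classic network configuration for the subscription, so review the file first.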

Once our legacy VNets were cleaned up, I re-enabled the Read-Only lock on the ExpressRoute.

In summary, nothing was “broken”, just an overlooked setting. I would recommend cleaning up your ASM/Classic VNets before migrating your ExpressRoute, as it’s so much easier and cleaner, but if you must leave some legacy virtual networks in place, remember to set the ExpressRoute “AllowClassicOperations” setting to “True” after the migration is complete.

And don’t forget those pesky ARM Resource Group locks.

Automating OMS Searches, Schedules, and Alerts in Azure Resource Manager Templates


Recently, I had an opportunity to produce a proof of concept around leveraging Operations Management Suite (OMS) as a centralized point of monitoring, analytics, and alerting.  As in any good Azure project, codifying your Azure resources in an Azure Resource Manager template should be table stakes. While the official ARM schema doesn't yet have all the children for the Microsoft.OperationalInsights/workspaces type, the Log Analytics documentation does document samples for interesting use cases that demonstrate data sources and saved searches.  Notably absent from this list, however, is anything about the /schedules and /action types. Now, a clever Azure Developer will realize that an ARM template is basically an orchestration and expression wrapper around the REST APIs.  In fact, I've yet to see a case where an ARM template supports a type that's not provided by those APIs.  With that knowledge and the alert API samples, we can adapt the armclient put <url> examples to an ARM template.

The Resources

What you see as an alert in the OMS portal is actually 2 to 3 distinct resources in the ARM model:

A mandatory schedules type, which defines the period and time window for the alert

{
  "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'))]",
  "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules",
  "apiVersion": "2015-03-20",
  "location": "[parameters('Workspace-Location')]",
  "dependsOn": [
    "[variables('Search-Id')]"
  ],
  "properties": {
    "Interval": 5,
    "QueryTimeSpan": 5,
    "Active": "true"
  }
}

A mandatory actions type, defining the trigger and email notification settings

{
  "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'), '/', variables('Alert-Name'))]",
  "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions",
  "apiVersion": "2015-03-20",
  "location": "[parameters('Workspace-Location')]",
  "dependsOn": [
    "[variables('Schedule-Id')]"
  ],
  "properties": {
    "Type": "Alert",
    "Name": "[parameters('Search-DisplayName')]",
    "Threshold": {
      "Operator": "gt",
      "Value": 0
    },
    "Version": 1
  }
}

An optional, secondary, actions type, defining the alert's webhook settings

{
  "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'), '/', variables('Action-Name'))]",
  "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions",
  "apiVersion": "2015-03-20",
  "location": "[parameters('Workspace-Location')]",
  "dependsOn": [
    "[variables('Schedule-Id')]"
  ],
  "properties": {
    "Type": "Webhook",
    "Name": "[variables('Action-Name')]",
    "WebhookUri": "[parameters('Action-Uri')]",
    "CustomPayload": "[parameters('Action-Payload')]",
    "Version": 1
  }
}

The Caveats

The full template adds a few eccentricities that are necessary for fully functioning resources.

First, let's talk about the search.  At the OMS interface, when you create a search, it is named <Category>|<Name> behind the scenes.  It is also lower-cased.  Now, the APIs will not downcase any name you send in, but if you use the interface to update that API-deployed search, you will duplicate it rather than being prompted to overwrite.  This tells us that for whatever reason, OMS' identifiers are case-sensitive.  This is why the template accepts a search category and name as parameters, then uses variables to lower-case them and match the interface's naming convention.

Second, a schedule's name must be globally unique to your workspace, though the actions need not be.  Again, the OMS UI will take over here and generate GUIDs for the resources behind the scenes.  Unfortunately for us, ARM Templates don't support random number or GUID generation - a necessary evil of being deterministic.  While the template could accept a GUID by parameter, I find that asking my users for GUIDs is impolite - a uniquestring() function on the already-unique search name should suffice for most use cases.  Since the actions don't require uniqueness, static names will do.
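To make the two naming rules concrete outside of a template, here is a small PowerShell sketch. It lower-cases a category and name the way the Search-Name variable does, then derives a repeatable schedule name from it. Get-DeterministicName is a hypothetical helper, and the SHA-256 hash is only a stand-in to show the idea of a deterministic name; it is not the algorithm behind ARM's uniquestring() and will not produce the same values.

```powershell
# Hypothetical helper: derive a short, repeatable name from a seed string.
function Get-DeterministicName {
    param([string]$Seed)

    $sha = [System.Security.Cryptography.SHA256]::Create()
    try {
        $hash = $sha.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($Seed))
    }
    finally {
        $sha.Dispose()
    }

    # First 7 bytes rendered as 14 lowercase hex characters.
    -join ($hash[0..6] | ForEach-Object { $_.ToString('x2') })
}

# Same shape as the template's Search-Name variable: <category>|<name>, lower-cased.
$searchName = ('Security' + '|' + 'Failed Logons').ToLower()
$scheduleName = Get-DeterministicName -Seed $searchName

$searchName
$scheduleName
```

Running it twice with the same category and name gives the same schedule name, which is the property the template relies on.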

Lastly, this template is not idempotent.  Idempotency is a characteristic of a desired state technology that focuses on ensuring the state (in our case, the template), rather than prescribing process (create this, update that, delete the other thing).  Each of Azure's types is responsible for ensuring that it implements idempotency as much as possible.  For some resources like the database extensions type, which imports .bacpac files, the entire type opts out of idempotency (once data is in the database, no more importing).  For others (such as Azure VMs), only changes in certain properties like administrator name/password break idempotency.  In the OMS search and schedule's case, if either one exists when you deploy the template, the deployment will fail with a code of BadRequest.  To add insult to injury, deleting the saved search from the interface does not delete the schedule.  Neither does deleting the alert from the UI - it simply flips the schedule to "Enabled": false.  There is, in fact, no way to delete the schedule via the UI - you must do so via the REST API, similar to the following:

armclient delete /{Schedule ResourceID}?api-version=2015-03-20

This means that, for now, any versioned solution which desires to deploy ARM template updates to its searches and alerts will need to either:

  1. Delete the existing search/schedule, then redeploy OR
  2. Deploy the updated version side-by-side (named differently), then clean up the prior version's items.

Luckily ours was a POC, so deleting and re-creating was very much an option.  I look forward to days when search and alerting types become first-class, idempotent citizens.  Until then, I hope this approach proves useful!
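For the delete call shown earlier, the schedule's resource ID follows the same nesting as the type name. A small sketch of composing it is below; Get-ScheduleResourceId is a hypothetical helper and every value passed to it is a placeholder. It prints the armclient command rather than running it, so you can look it over first.

```powershell
# Hypothetical helper that composes the schedule's resource ID.
function Get-ScheduleResourceId {
    param(
        [string]$SubscriptionId,
        [string]$ResourceGroup,
        [string]$Workspace,
        [string]$SearchName,
        [string]$ScheduleName
    )

    # Segment order mirrors the nested type: workspaces/savedSearches/schedules
    $format = '/subscriptions/{0}/resourceGroups/{1}/providers/Microsoft.OperationalInsights/workspaces/{2}/savedSearches/{3}/schedules/{4}'
    $format -f $SubscriptionId, $ResourceGroup, $Workspace, $SearchName, $ScheduleName
}

# Placeholder values for illustration only.
$id = Get-ScheduleResourceId -SubscriptionId '00000000-0000-0000-0000-000000000000' `
    -ResourceGroup 'DemoRG' -Workspace 'DemoWorkspace' `
    -SearchName 'demo|failedlogons' -ScheduleName 'abc123'

# Quote the URL: the search name contains a pipe character.
'armclient delete "{0}?api-version=2015-03-20"' -f $id
```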

The Template

The template below deploys a search, a schedule, a threshold action (the alert), and a webhook action (the action) that includes a custom JSON payload (remember to escape your JSON!).  With a little work, this template could be reworked to segregate the alert-specific components from the search so that multiple alerts could be deployed to the same search or adapted for use with Azure Automation Runbooks.  Enjoy!

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "Action-Payload": {
      "type": "string"
    },
    "Action-Uri": {
      "type": "string"
    },
    "Alert-DisplayName": {
      "type": "string"
    },
    "Search-Category": {
      "type": "string"
    },
    "Search-DisplayName": {
      "type": "string"
    },
    "Search-Query": {
      "type": "string"
    },
    "Workspace-Name": {
      "type": "string"
    },
    "Workspace-Location": {
      "type": "string"
    }
  },
  "variables": {
    "Action-Name": "webhook",
    "Alert-Name": "alert",
    "Schedule-Name": "[uniquestring(variables('Search-Name'))]",
    "Search-Name": "[toLower(concat(parameters('Search-Category'), '|', parameters('Search-DisplayName')))]",
    "Search-Id": "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches', parameters('Workspace-Name'), variables('Search-Name'))]",
    "Schedule-Id": "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches/schedules', parameters('Workspace-Name'), variables('Search-Name'), variables('Schedule-Name'))]",
    "Alert-Id": "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions', parameters('Workspace-Name'), variables('Search-Name'), variables('Schedule-Name'), variables('Alert-Name'))]",
    "Action-Id": "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions', parameters('Workspace-Name'), variables('Search-Name'), variables('Schedule-Name'), variables('Action-Name'))]"
  },
  "resources": [
    {
      "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'))]",
      "type": "Microsoft.OperationalInsights/workspaces/savedSearches",
      "apiVersion": "2015-03-20",
      "location": "[parameters('Workspace-Location')]",
      "dependsOn": [],
      "properties": {
        "Category": "[parameters('Search-Category')]",
        "DisplayName": "[parameters('Search-DisplayName')]",
        "Query": "[parameters('Search-Query')]",
        "Version": "1"
      }
    },
    {
      "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'))]",
      "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules",
      "apiVersion": "2015-03-20",
      "location": "[parameters('Workspace-Location')]",
      "dependsOn": [
        "[variables('Search-Id')]"
      ],
      "properties": {
        "Interval": 5,
        "QueryTimeSpan": 5,
        "Active": "true"
      }
    },
    {
      "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'), '/', variables('Alert-Name'))]",
      "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions",
      "apiVersion": "2015-03-20",
      "location": "[parameters('Workspace-Location')]",
      "dependsOn": [
        "[variables('Schedule-Id')]"
      ],
      "properties": {
        "Type": "Alert",
        "Name": "[parameters('Search-DisplayName')]",
        "Threshold": {
          "Operator": "gt",
          "Value": 0
        },
        "Version": 1
      }
    },
    {
      "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'), '/', variables('Action-Name'))]",
      "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions",
      "apiVersion": "2015-03-20",
      "location": "[parameters('Workspace-Location')]",
      "dependsOn": [
        "[variables('Schedule-Id')]"
      ],
      "properties": {
        "Type": "Webhook",
        "Name": "[variables('Action-Name')]",
        "WebhookUri": "[parameters('Action-Uri')]",
        "CustomPayload": "[parameters('Action-Payload')]",
        "Version": 1
      }
    }
  ],
  "outputs": {}
}
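On the "escape your JSON" point: the Action-Payload parameter is a string, so the payload's own quotes have to be escaped when it is embedded in a parameters file. One way to avoid doing that by hand is to let ConvertTo-Json do it, as in this sketch. The payload fields are placeholders of my own, not a schema the webhook requires, and only one of the template's parameters is shown.

```powershell
# Hypothetical webhook payload; the field names are placeholders.
$payload = [ordered]@{
    text     = 'Alert fired'
    severity = 'high'
}
$payloadJson = $payload | ConvertTo-Json -Compress

# Parameters file in the standard deploymentParameters shape. Serializing the
# outer object escapes the quotes inside the inner JSON string automatically.
$parameterFile = [ordered]@{
    '$schema'      = 'https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#'
    contentVersion = '1.0.0.0'
    parameters     = [ordered]@{
        'Action-Payload' = @{ value = $payloadJson }
    }
}
$parameterJson = $parameterFile | ConvertTo-Json -Depth 5

$parameterJson
```

Reading the file back with ConvertFrom-Json returns the original payload string, which is a quick way to check the escaping before you deploy.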

Ruling Your Resource Groups with an Iron Fist


If it is not obvious by now, we deploy a lot of resources to Azure.  The rather expensive issue we encounter is that people rarely remember to clean up after themselves; it goes without saying we have encountered some staggeringly large bills.  To remedy this, we enforced a simple, yet draconian policy around persisting Resource Groups.  If your Resource Group does not possess values for our required tag set (Owner and Solution), it can be deleted at any time. At the time the tagging edict went out we had well over 1000 resource groups.  As our lab manager began to script the removal using the Azure Cmdlets we encountered a new issue: the synchronous operations of the Cmdlets just took too long.  We now use a little script we call "The Reaper" to "fire and forget".

If you read my previous post about Azure AD and Powershell, you may have noticed my predilection for doing things via the REST API.  "The Reaper" simply uses the module from that post to obtain an authorization token and uses the REST API to evaluate Resource Groups for tag compliance and delete the offenders asynchronously. There are a few caveats to note; it will only work with organizational accounts and it will not delete any resources which have a lock.

It is published on the PowerShell Gallery, so you can obtain it like so:

[powershell]
# Just download it
Save-Script -Name thereaper -Path "C:\myscripts"

# Install the script (with the module dependency)
Install-Script -Name thereaper
[/powershell]

The required parameters are a PSCredential and an array of strings for the tags to inspect. Notable Switch parameters are AllowEmptyTags (only checks for presence of the required tags) and DeleteEmpty (removes Resource Groups with no Resources even if tagged properly). There is also a SubscriptionFilters parameter taking an array of Subscription IDs to limit the scope; otherwise all Subscriptions your account has access to will be evaluated. If you would simply like to see what the results would be, use the WhatIf Switch. Usage is as follows:

[powershell]
$Credential = New-Object PSCredential("username@tenant.com", ("YourPassword" | ConvertTo-SecureString -AsPlainText -Force))
$results = .\thereaper.ps1 -Credential $Credential `
    -RequiredTags "Owner","Solution" -DeleteEmpty `
    -SubscriptionFilters "49f4ba3e-72ec-4621-8e9e-89d312eafd1f","554503f6-a3aa-4b7a-a5a9-641ac65bf746"
[/powershell]
[/powershell]
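To illustrate the tag rule itself, here is a rough sketch of the compliance check as I have described it: every required tag must be present, and must have a value unless AllowEmptyTags is set. Test-TagCompliance is a hypothetical function written for this post; it is not lifted from the published script and runs without touching Azure.

```powershell
# Hypothetical helper: returns $true when a tag set satisfies the policy.
function Test-TagCompliance {
    param(
        [hashtable]$Tags,
        [string[]]$RequiredTags,
        [switch]$AllowEmptyTags
    )

    foreach ($name in $RequiredTags) {
        # A missing tag set or a missing tag always fails.
        if ($null -eq $Tags -or -not $Tags.ContainsKey($name)) { return $false }

        # An empty value fails unless the caller only cares about presence.
        if (-not $AllowEmptyTags -and [string]::IsNullOrWhiteSpace($Tags[$name])) { return $false }
    }
    return $true
}

$required = 'Owner', 'Solution'

Test-TagCompliance -Tags @{ Owner = 'will'; Solution = 'lab' } -RequiredTags $required
Test-TagCompliance -Tags @{ Owner = 'will' } -RequiredTags $required
Test-TagCompliance -Tags @{ Owner = 'will'; Solution = '' } -RequiredTags $required -AllowEmptyTags
```

The first call passes, the second fails because Solution is missing, and the third passes only because AllowEmptyTags is set.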

A standard liability waiver applies; I hold no responsibility for the Resource Groups you destroy.