ExpressRoute

ExpressRoute Migration from ASM to ARM and legacy ASM Virtual Networks


I recently ran into an issue where an ExpressRoute circuit had been migrated from Classic (ASM) to the new portal (ARM), but legacy Classic Virtual Networks (VNets) were still in operation. These VNets refused to be deleted by either portal or PowerShell. Disconnecting the old VNet's Gateway through the Classic portal would report success, but the gateway would stay connected.

There's no option to disconnect an ASM gateway in the ARM portal, only a delete option. I gave that a shot and, predictably, this was the result:

[Screenshot: FailedDeleteGW.PNG — the gateway delete failure in the ARM portal]

Ok, let’s go to PowerShell and look for that obstinate link. Running Get-AzureDedicatedCircuitLink resulted in the following error:

PS C:\> Get-AzureDedicatedCircuitLink -ServiceKey $ServiceKey -VNetName $Vnet
Get-AzureDedicatedCircuitLink : InternalError: The server encountered an internal error. Please retry the request.
At line:1 char:1
+ Get-AzureDedicatedCircuitLink -ServiceKey xxxxxx-xxxx-xxxx-xxxx-xxx...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : CloseError: (:) [Get-AzureDedicatedCircuitLink], CloudException
    + FullyQualifiedErrorId : Microsoft.WindowsAzure.Commands.ExpressRoute.GetAzureDedicatedCircuitLinkCommand

I couldn’t even find the link. Not only was modifying the circuit an issue, but reads were failing, too.

It turned out to be a simple setting change. Because Classic VNets were still in place when the ExpressRoute circuit was migrated, a final step was needed: enabling the circuit for both deployment models. Take a look at the culprit setting, visible in the output of Get-AzureRmExpressRouteCircuit:
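(For reference, a call along these lines produced the output below; "DemoCkt" and "DemoRG" are the demo names reused later in this post — substitute your own circuit and resource group.)

# Pull the migrated circuit so we can inspect its properties (names are placeholders)
Get-AzureRmExpressRouteCircuit -Name "DemoCkt" -ResourceGroupName "DemoRG"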

"serviceProviderProperties": {

"serviceProviderName": "equinix",

"peeringLocation": "Silicon Valley",

"bandwidthInMbps": 1000

},

"circuitProvisioningState": "Disabled",

"allowClassicOperations": false,

"gatewayManagerEtag": "",

"serviceKey": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",

"serviceProviderProvisioningState": "Provisioned"

With AllowClassicOperations set to false, ASM operations are blocked entirely, including a simple "get" against the ExpressRoute circuit. Granting access is straightforward:

# Get details of the ExpressRoute circuit
$ckt = Get-AzureRmExpressRouteCircuit -Name "DemoCkt" -ResourceGroupName "DemoRG"

# Set "Allow Classic Operations" to TRUE
$ckt.AllowClassicOperations = $true

# Commit the change back to Azure (without this, the change exists only in the local object)
Set-AzureRmExpressRouteCircuit -ExpressRouteCircuit $ckt

More info on this is in Microsoft's documentation on moving ExpressRoute circuits from the classic to the Resource Manager deployment model.

But we still weren’t finished. I could now get a successful response from this:

Get-AzureDedicatedCircuitLink -ServiceKey $ServiceKey -VNetName $Vnet

However, this still failed:

Remove-AzureDedicatedCircuitLink -ServiceKey $ServiceKey -VNetName $Vnet

So reads worked, but writes still failed. Then I remembered the ARM portal's resource lock feature, and sure enough, a Read-Only lock on the Resource Group was being inherited by the ExpressRoute circuit (locks on a Resource Group apply to every resource inside it). Once the lock was removed, voila, I could remove the stubborn VNets no problem.
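For anyone hitting the same wall, finding and removing the offending lock is quick with the AzureRM lock cmdlets. A sketch, assuming a single lock on the group ("DemoRG" is a placeholder resource group name):

# Find the lock applied to the Resource Group holding the circuit
$lock = Get-AzureRmResourceLock -ResourceGroupName "DemoRG"

# Remove the Read-Only lock by Id so write operations can succeed again
Remove-AzureRmResourceLock -LockId $lock.LockId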

# Remove the Circuit Link for the VNet
Remove-AzureDedicatedCircuitLink -ServiceKey $ServiceKey -VNetName $Vnet

# Disconnect the gateway
Set-AzureVNetGateway -Disconnect -VNetName $Vnet -LocalNetworkSiteName <LocalNetworkSiteName>

# Delete the gateway
Remove-AzureVNetGateway -VNetName $Vnet

There's still no command to remove a single VNet; you have to use the portal (either will work), or you can use PowerShell to export the NetworkConfig.xml file, edit it, and import it again, as sketched below.
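Here's a minimal sketch of that export-edit-import round trip with the classic cmdlets (the file path is a placeholder; the edit itself is done by hand):

# Export the current classic network configuration (path is a placeholder)
Get-AzureVNetConfig -ExportToFile "C:\temp\NetworkConfig.xml"

# ...edit the file and delete the <VirtualNetworkSite> element for the unwanted VNet...

# Import the edited configuration; VNets missing from the file are removed
Set-AzureVNetConfig -ConfigurationPath "C:\temp\NetworkConfig.xml"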

Once our legacy VNets were cleaned up, I re-enabled the Read-Only lock on the ExpressRoute.
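Re-applying it is a one-liner (lock and group names are placeholders):

# Re-create the Read-Only lock on the Resource Group
New-AzureRmResourceLock -LockLevel ReadOnly -LockName "DemoLock" -ResourceGroupName "DemoRG"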

In summary, nothing was "broken," just an overlooked setting. I would recommend cleaning up your ASM/Classic VNets before migrating your ExpressRoute; it's so much easier and cleaner. But if you must leave some legacy virtual networks in place, remember to set the circuit's AllowClassicOperations setting to true after the migration is complete.

And don’t forget those pesky ARM Resource Group locks.

The Concept of Cloud Adjacency


Several years ago, when we first started theorizing about hybrid cloud, it was clear that to do our definition of hybrid cloud properly, you needed a fast, low-latency connection between your private cloud and the public clouds you were using. After talking to major enterprise customers, we identified five cases where this type of arrangement made sense and actually accelerated cloud adoption.

  1. Moving workloads but not data - If you expect to move workloads between cloud providers, and between public and private clouds, to take advantage of capacity and price changes, then moving workloads quickly and efficiently means, in some cases, not moving the data.
  2. Regulatory and Compliance Reasons - Say the data is sensitive and must remain under corporate physical ownership while at rest, but you still want to use the cloud for apps around that data (including possibly front-ending it).
  3. Application Support - You have a core application that can't move to cloud due to support requirements (EPIC Health is an example of this type of application), but you have other surround applications that can move to cloud yet need a fast, low-latency connection back to this main application.
  4. Technical Compatibility - There is a technical requirement, say real-time storage replication to a DR site or very high IOPS, that Azure for one reason or another can't handle; in this case the data can be placed on traditional storage infrastructure and presented to the cloud over the high-bandwidth, low-latency connection.
  5. Cost - Some workloads are today significantly more expensive to run in cloud than on existing large on-premises servers. These are mostly very large compute platforms that scale up rather than out, especially in cases where a customer already owns this type of equipment and it isn't fully depreciated. Again, it may make sense to run surround systems in public cloud on commodity two-socket boxes, while retaining the large four-socket, high-memory instances until fully depreciated.

 

All of these scenarios require a low-latency, high-bandwidth connection to the cloud, as well as a solid connection back to the corporate network. In these cases you have a couple of options for getting this type of connection.

  1. Place the equipment into your own datacenter, and buy high-bandwidth Azure ExpressRoute and, if needed, Amazon Direct Connect circuits from an established carrier like AT&T, Level3, Verizon, etc.
  2. Place the equipment into a carrier colocation facility and create a high-bandwidth connection from there back to your premises. This places the equipment closer to the cloud but introduces latency between you and the equipment. This works because most applications tolerate latency between the user and the application, but are far less tolerant of latency between the application tier and the database tier, or between the database and its underlying storage. Additionally, you can place a traditional WAN accelerator (Riverbed or the like) on the link between the colocation facility and your premises.

 

For simply connecting your network to Azure, both options work equally well. If you have one of the five scenarios above, however, the latter option (colocation) is better. Depending on the colocation provider, the latency will be low to very low (2-40 ms depending on location). All of the major carriers offer a colocation arrangement that is close to the cloud edge. At Avanade, I run a fairly large hybrid cloud lab environment, which allows us to perform real-world testing and demonstrations of these concepts. In our case we've chosen to host the lab with one of these colocation providers, Equinix. Equinix has an interesting story in that they are a carrier-neutral facility. In other words, they run the datacenter and provide a brokerage (the Cloud Exchange), but don't actually sell private line circuits; you connect through their brokerage directly to the other party. This is interesting because it means I don't pay a 10-gig local loop fee, I simply pay for access to the exchange. Right now I buy two 10-gig ports to the exchange, and over those I have 10 gigs of ExpressRoute and 10 gigs of Amazon Direct Connect delivered to my edge routers. The latency between a VM running in my private cloud in the lab and a VM in Amazon or Azure is sub-2 ms. This is incredibly fast, fast enough to allow me to showcase each of the above scenarios.
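If you want to sanity-check a latency figure like that yourself, a simple round-trip measurement from an on-premises VM to a cloud VM over the circuit will do (the target IP is a placeholder):

# Ping a cloud VM across the circuit and summarize round-trip times (IP is a placeholder)
Test-Connection -ComputerName "10.1.2.4" -Count 10 |
    Measure-Object -Property ResponseTime -Average -Maximum |
    Format-List Average, Maximum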

 

We routinely move workloads between providers without moving the underlying data by presenting the data disks over iSCSI from our NetApp FAS to the VM running at the cloud provider. When we move the VM, we migrate the OS and system state, but the data disk is simply detached and reattached at the new location. This is possible due to the low-latency connection and is a capability that NetApp sells as NetApp Private Storage (NPS). This capability is also useful for the regulatory story of keeping data at rest under our control: the data doesn't live in the cloud provider, and we can physically pinpoint its location at all times. It also addresses some of the technical compatibility scenario. Because my data is back-ended on a capable SAN with proper SAN-based replication and performance characteristics, I can run workloads that I might not otherwise have been able to run in cloud due to IOPS, cost-for-performance, or feature challenges with cloud storage.
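As a rough illustration of the reattach step on a Windows VM at the new provider, the built-in iSCSI cmdlets are all it takes (the portal address is a placeholder for the FAS target; this is a sketch of the mechanism, not NetApp's NPS tooling):

# Register the array's iSCSI target portal (address is a placeholder)
New-IscsiTargetPortal -TargetPortalAddress "10.1.2.50"

# Connect to the discovered target; the data disk then appears as a local volume
Get-IscsiTarget | Connect-IscsiTarget -IsPersistent $true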

 

Second, I have compute located in this adjacent space. Some of this compute consists of very large quad-socket machines with multiple TB of RAM, running very large line-of-business applications. Those LOB applications have considerable surround applications that can run just fine on public cloud compute, but the big one isn't cost effective to run on public cloud since I've already invested in this hardware, or perhaps I'm not allowed to move it to public cloud due to support constraints. Another example of this is non-x86 platforms. Let's say I have an IBM POWER or mainframe platform. I don't want to retire it or re-platform it due to cost, but I have a number of x86-based surround applications that I'd like to move to cloud. I can place my mainframe within the cloud-adjacent space and access those resources as if they, too, were running in the public cloud.

 

As you can see, cloud-adjacent space opens up a number of truly hybrid scenarios. We're big supporters of the concept when it makes sense, and while there are complexities, it can unlock moving additional workloads to public cloud.