Simulating an Azure storage account failure

redundancy_banner.jpg

Storage Redundancy in the Cloud

Redundancy and failover are always important factors when designing and deploying applications.  As we build out applications in the cloud, we have seen major disruptions caused by a single Azure storage account being used across an entire application.  Logically this makes sense as a single container for all items, but from a redundancy standpoint that single container becomes a single point of failure.  Just because it is in the cloud doesn't mean it's redundant.

While Azure storage is generally something you can consider stable, this is IT and anything that can happen, will happen.  We have seen developers and administrators accidentally delete accounts, and Azure has had outages that include storage account failures and, far less commonly, data loss.  At this point I should mention that deployment and use of RBAC might have prevented some of these accidental deletions, but not in every case.

An Example Application

In this example, let's consider you are building an application with two web front end servers on IaaS VMs, and you would like it to be as redundant as possible.  We could use an Azure load balancer and deploy the two VMs into an availability set, as shown below.  While the load balancer handles traffic and the availability set handles fault and upgrade domains for the VMs, both VMs still sit on a single storage account that is outside of the availability set's protection. If you lose the storage account, both VMs fail and your application goes offline.

Capture1_thumb.jpg

Adding Redundancy

If we take this design and add a second storage account for one of the IaaS VMs, we can eliminate several scenarios where the application might go offline.  There are several redundancy options for the storage accounts themselves, which you can evaluate based on your needs, budget and performance.  Keep in mind that there are limits on the number of storage accounts you can provision per subscription, and management can become more complex as the count grows.

Capture2.jpg

This recommendation focuses on a single storage account going 'offline' for whatever reason. As you scale up to larger applications you may want two or more storage accounts supporting multiple VMs. To reduce the number of storage accounts, you could consider striping storage accounts across multiple load balanced application tiers. It's worth noting that this will help protect against accidental deletion, but even having two storage accounts may not protect you against Azure failures. There is no guarantee that a second storage account won't be on the same hardware, or otherwise within the same failure envelope, as the first. It's better than nothing, but if you need a tighter RTO/RPO, you need to look at a proper active/active configuration across separate regions.

While LRS is the recommended strategy for VHDs to increase performance and reduce cost, you may want to consider more resilient options and use ZRS or GRS replication for at least one of the storage accounts. Note that there are limitations to using ZRS and GRS for VM disks, specifically around performance and the risk of corruption when disk striping. Depending on the application's requirements, you might even deploy more VMs in another availability set in another region.
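If you do add a more resilient second account, creating it is a one-liner. Here is a rough sketch; the resource group and account names are placeholders, and depending on your AzureRM module version the replication parameter is -SkuName or, on older releases, -Type:

[powershell]
# Sketch: create a second storage account with geo-redundant replication.
# "MyResourceGroup" and "mysecondstorageacct01" are placeholder names.
# Newer AzureRM.Storage versions use -SkuName; older ones use -Type instead.
New-AzureRmStorageAccount -ResourceGroupName "MyResourceGroup" `
    -Name "mysecondstorageacct01" `
    -Location "West US" `
    -SkuName "Standard_GRS"
[/powershell]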

Simulating Storage Account Failure 

As an administrator trying to test a redundant application, there is no 'offline' button that lets you test a storage failure.  There are some options to simulate one: if you break the lease on the blob you can simulate a hard stop, but this is potentially destructive to the OS. You can find more information about breaking a lease at this Microsoft Reference. You could also stop and start the VMs yourself, but as the application scales and grows more complex you introduce human error, and you still have to go and start them all again afterwards. To help with this administrative task, I have modified a script for stopping and starting VMs.
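For reference, breaking the lease can also be done from PowerShell. The sketch below uses placeholder names for the storage account, container and blob, and again, breaking the lease on an in-use OS disk is potentially destructive:

[powershell]
# Sketch only: force-break the lease on a VHD blob to simulate a hard storage failure.
# "StorageAccount01", "vhds" and "myvm-osdisk.vhd" are placeholders; doing this to a running VM's OS disk can corrupt it.
$ctx  = New-AzureStorageContext -StorageAccountName "StorageAccount01" -StorageAccountKey "<storage key>"
$blob = Get-AzureStorageBlob -Container "vhds" -Blob "myvm-osdisk.vhd" -Context $ctx
# BreakLease lives on the underlying .NET CloudBlob object; the nulls accept the default break period and options
$blob.ICloudBlob.BreakLease($null, $null, $null, $null)
[/powershell]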

Script and Source

Starting from an existing script, I made some minor modifications, combined it with code from Darren Robinson here that utilizes RamblingCookieMonster's Invoke-Parallel function, and put everything into a single script.

This script lets the administrator specify a resource group and a storage account; it then finds all VMs in that resource group that sit on that storage account and shuts them down gracefully.  Invoke-Parallel allows the shutdown tasks to run at the same time, saving time.  You can then conduct your application testing. Once testing is complete, you can use the same script to start the VMs again.

The script, Change-VMStateByStorageAccount, will ask for your Azure credentials if they aren't already present in your PowerShell session. It requires three parameters: ResourceGroup, StorageAccount, and Power (Stop or Start).

Example to stop all VMs

[powershell] Change-VMStateByStorageAccount -ResourceGroup "MyResourceGroup" -StorageAccount "StorageAccount01" -Power "Stop" [/powershell]
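Example to start them all again once testing is complete

[powershell] Change-VMStateByStorageAccount -ResourceGroup "MyResourceGroup" -StorageAccount "StorageAccount01" -Power "Start" [/powershell]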

PowerShell Code

[powershell]
Param(
    [Parameter(Mandatory=$true)] [String] $ResourceGroup,
    [Parameter(Mandatory=$true)] [String] $StorageAccount,
    [Parameter(Mandatory=$true)] [String] $Power
)

$StorageSuffix = "blob.core.windows.net"

if (!$Power){Write-Host "No powerstate specified. Use -Power start|stop"}
if (!$ResourceGroup){Write-Host "No Azure Resource Group specified. Use -ResourceGroup 'ResourceGroupName'"}
if (!$StorageAccount){Write-Host "No Azure Storage Account name specified. Use -StorageAccount 'storageaccount'"}

function Invoke-Parallel {
    [cmdletbinding(DefaultParameterSetName='ScriptBlock')]
    Param (
        [Parameter(Mandatory=$false,position=0,ParameterSetName='ScriptBlock')]
        [System.Management.Automation.ScriptBlock]$ScriptBlock,

        [Parameter(Mandatory=$false,ParameterSetName='ScriptFile')]
        [ValidateScript({test-path $_ -pathtype leaf})]
        $ScriptFile,

        [Parameter(Mandatory=$true,ValueFromPipeline=$true)]
        [Alias('CN','__Server','IPAddress','Server','ComputerName')]
        [PSObject]$InputObject,

        [PSObject]$Parameter,

        [switch]$ImportVariables,

        [switch]$ImportModules,

        [int]$Throttle = 20,

        [int]$SleepTimer = 200,

        [int]$RunspaceTimeout = 0,

        [switch]$NoCloseOnTimeout = $false,

        [int]$MaxQueue,

        [validatescript({Test-Path (Split-Path $_ -parent)})]
        [string]$LogFile = "C:\temp\log.log",

        [switch] $Quiet = $false
    )

    Begin {

        #No max queue specified? Estimate one.
        #We use the script scope to resolve an odd PowerShell 2 issue where MaxQueue isn't seen later in the function
        if( -not $PSBoundParameters.ContainsKey('MaxQueue') ) {
            if($RunspaceTimeout -ne 0){ $script:MaxQueue = $Throttle }
            else{ $script:MaxQueue = $Throttle * 3 }
        }
        else {
            $script:MaxQueue = $MaxQueue
        }

        Write-Verbose "Throttle: '$throttle' SleepTimer '$sleepTimer' runSpaceTimeout '$runspaceTimeout' maxQueue '$maxQueue' logFile '$logFile'"

        #If they want to import variables or modules, create a clean runspace, get loaded items, use those to exclude items
        if ($ImportVariables -or $ImportModules) {
            $StandardUserEnv = [powershell]::Create().addscript({

                #Get modules and snapins in this clean runspace
                $Modules = Get-Module | Select -ExpandProperty Name
                $Snapins = Get-PSSnapin | Select -ExpandProperty Name

                #Get variables in this clean runspace
                #Called last to get vars like $? into session
                $Variables = Get-Variable | Select -ExpandProperty Name

                #Return a hashtable where we can access each.
                @{
                    Variables = $Variables
                    Modules = $Modules
                    Snapins = $Snapins
                }
            }).invoke()[0]

            if ($ImportVariables) {
                #Exclude common parameters, bound parameters, and automatic variables
                Function _temp {[cmdletbinding()] param() }
                $VariablesToExclude = @( (Get-Command _temp | Select -ExpandProperty parameters).Keys + $PSBoundParameters.Keys + $StandardUserEnv.Variables )
                Write-Verbose "Excluding variables $( ($VariablesToExclude | sort ) -join ", ")"

                # we don't use 'Get-Variable -Exclude', because it uses regexps.
                # One of the variables that we pass is '$?'.
                # There could be other variables with such problems.
                # Scope 2 required if we move to a real module
                $UserVariables = @( Get-Variable | Where { -not ($VariablesToExclude -contains $_.Name) } )
                Write-Verbose "Found variables to import: $( ($UserVariables | Select -expandproperty Name | Sort ) -join ", " | Out-String).`n"
            }

            if ($ImportModules) {
                $UserModules = @( Get-Module | Where {$StandardUserEnv.Modules -notcontains $_.Name -and (Test-Path $_.Path -ErrorAction SilentlyContinue)} | Select -ExpandProperty Path )
                $UserSnapins = @( Get-PSSnapin | Select -ExpandProperty Name | Where {$StandardUserEnv.Snapins -notcontains $_ } )
            }
        }

        #region functions

        Function Get-RunspaceData {
            [cmdletbinding()]
            param( [switch]$Wait )

            #loop through runspaces
            #if $wait is specified, keep looping until all complete
            Do {

                #set more to false for tracking completion
                $more = $false

                #Progress bar if we have inputobject count (bound parameter)
                if (-not $Quiet) {
                    Write-Progress -Activity "Running Query" -Status "Starting threads" `
                        -CurrentOperation "$startedCount threads defined - $totalCount input objects - $script:completedCount input objects processed" `
                        -PercentComplete $( Try { $script:completedCount / $totalCount * 100 } Catch {0} )
                }

                #run through each runspace.
                Foreach($runspace in $runspaces) {

                    #get the duration - inaccurate
                    $currentdate = Get-Date
                    $runtime = $currentdate - $runspace.startTime
                    $runMin = [math]::Round( $runtime.totalminutes ,2 )

                    #set up log object
                    $log = "" | select Date, Action, Runtime, Status, Details
                    $log.Action = "Removing:'$($runspace.object)'"
                    $log.Date = $currentdate
                    $log.Runtime = "$runMin minutes"

                    #If runspace completed, end invoke, dispose, recycle, counter++
                    If ($runspace.Runspace.isCompleted) {

                        $script:completedCount++

                        #check if there were errors
                        if($runspace.powershell.Streams.Error.Count -gt 0) {

                            #set the logging info and move the file to completed
                            $log.status = "CompletedWithErrors"
                            Write-Verbose ($log | ConvertTo-Csv -Delimiter ";" -NoTypeInformation)[1]
                            foreach($ErrorRecord in $runspace.powershell.Streams.Error) {
                                Write-Error -ErrorRecord $ErrorRecord
                            }
                        }
                        else {

                            #add logging details and cleanup
                            $log.status = "Completed"
                            Write-Verbose ($log | ConvertTo-Csv -Delimiter ";" -NoTypeInformation)[1]
                        }

                        #everything is logged, clean up the runspace
                        $runspace.powershell.EndInvoke($runspace.Runspace)
                        $runspace.powershell.dispose()
                        $runspace.Runspace = $null
                        $runspace.powershell = $null
                    }

                    #If runtime exceeds max, dispose the runspace
                    ElseIf ( $runspaceTimeout -ne 0 -and $runtime.totalseconds -gt $runspaceTimeout) {

                        $script:completedCount++
                        $timedOutTasks = $true

                        #add logging details and cleanup
                        $log.status = "TimedOut"
                        Write-Verbose ($log | ConvertTo-Csv -Delimiter ";" -NoTypeInformation)[1]
                        Write-Error "Runspace timed out at $($runtime.totalseconds) seconds for the object:`n$($runspace.object | out-string)"

                        #Depending on how it hangs, we could still get stuck here as dispose calls a synchronous method on the powershell instance
                        if (!$noCloseOnTimeout) { $runspace.powershell.dispose() }
                        $runspace.Runspace = $null
                        $runspace.powershell = $null
                        $completedCount++
                    }

                    #If runspace isn't null set more to true
                    ElseIf ($runspace.Runspace -ne $null ) {
                        $log = $null
                        $more = $true
                    }

                    #log the results if a log file was indicated
                    if($logFile -and $log){
                        ($log | ConvertTo-Csv -Delimiter ";" -NoTypeInformation)[1] | out-file $LogFile -append
                    }
                }

                #Clean out unused runspace jobs
                $temphash = $runspaces.clone()
                $temphash | Where { $_.runspace -eq $Null } | ForEach {
                    $Runspaces.remove($_)
                }

                #sleep for a bit if we will loop again
                if($PSBoundParameters['Wait']){
                    Start-Sleep -milliseconds $SleepTimer
                }

            #Loop again only if -wait parameter and there are more runspaces to process
            } while ($more -and $PSBoundParameters['Wait'])

        #End of runspace function
        }

        #endregion functions

        #region Init

        if($PSCmdlet.ParameterSetName -eq 'ScriptFile') {
            $ScriptBlock = [scriptblock]::Create( $(Get-Content $ScriptFile | out-string) )
        }
        elseif($PSCmdlet.ParameterSetName -eq 'ScriptBlock') {
            #Start building parameter names for the param block
            [string[]]$ParamsToAdd = '$_'
            if( $PSBoundParameters.ContainsKey('Parameter') ) {
                $ParamsToAdd += '$Parameter'
            }

            $UsingVariableData = $Null

            # This code enables $Using support through the AST.
            # This is entirely from Boe Prox, and his https://github.com/proxb/PoshRSJob module; all credit to Boe!

            if($PSVersionTable.PSVersion.Major -gt 2) {
                #Extract using references
                $UsingVariables = $ScriptBlock.ast.FindAll({$args[0] -is [System.Management.Automation.Language.UsingExpressionAst]},$True)

                If ($UsingVariables) {
                    $List = New-Object 'System.Collections.Generic.List`1[System.Management.Automation.Language.VariableExpressionAst]'
                    ForEach ($Ast in $UsingVariables) {
                        [void]$list.Add($Ast.SubExpression)
                    }

                    $UsingVar = $UsingVariables | Group SubExpression | ForEach {$_.Group | Select -First 1}

                    #Extract the name, value, and create replacements for each
                    $UsingVariableData = ForEach ($Var in $UsingVar) {
                        Try {
                            $Value = Get-Variable -Name $Var.SubExpression.VariablePath.UserPath -ErrorAction Stop
                            [pscustomobject]@{
                                Name = $Var.SubExpression.Extent.Text
                                Value = $Value.Value
                                NewName = ('$__using_{0}' -f $Var.SubExpression.VariablePath.UserPath)
                                NewVarName = ('__using_{0}' -f $Var.SubExpression.VariablePath.UserPath)
                            }
                        }
                        Catch {
                            Write-Error "$($Var.SubExpression.Extent.Text) is not a valid Using: variable!"
                        }
                    }
                    $ParamsToAdd += $UsingVariableData | Select -ExpandProperty NewName -Unique

                    $NewParams = $UsingVariableData.NewName -join ', '
                    $Tuple = [Tuple]::Create($list, $NewParams)
                    $bindingFlags = [Reflection.BindingFlags]"Default,NonPublic,Instance"
                    $GetWithInputHandlingForInvokeCommandImpl = ($ScriptBlock.ast.gettype().GetMethod('GetWithInputHandlingForInvokeCommandImpl',$bindingFlags))

                    $StringScriptBlock = $GetWithInputHandlingForInvokeCommandImpl.Invoke($ScriptBlock.ast,@($Tuple))

                    $ScriptBlock = [scriptblock]::Create($StringScriptBlock)

                    Write-Verbose $StringScriptBlock
                }
            }

            $ScriptBlock = $ExecutionContext.InvokeCommand.NewScriptBlock("param($($ParamsToAdd -Join ", "))`r`n" + $Scriptblock.ToString())
        }
        else {
            Throw "Must provide ScriptBlock or ScriptFile"; Break
        }

        Write-Debug "`$ScriptBlock: $($ScriptBlock | Out-String)"
        Write-Verbose "Creating runspace pool and session states"

        #If specified, add variables and modules/snapins to session state
        $sessionstate = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()
        if ($ImportVariables) {
            if($UserVariables.count -gt 0) {
                foreach($Variable in $UserVariables) {
                    $sessionstate.Variables.Add( (New-Object -TypeName System.Management.Automation.Runspaces.SessionStateVariableEntry -ArgumentList $Variable.Name, $Variable.Value, $null) )
                }
            }
        }
        if ($ImportModules) {
            if($UserModules.count -gt 0) {
                foreach($ModulePath in $UserModules) {
                    $sessionstate.ImportPSModule($ModulePath)
                }
            }
            if($UserSnapins.count -gt 0) {
                foreach($PSSnapin in $UserSnapins) {
                    [void]$sessionstate.ImportPSSnapIn($PSSnapin, [ref]$null)
                }
            }
        }

        #Create runspace pool
        $runspacepool = [runspacefactory]::CreateRunspacePool(1, $Throttle, $sessionstate, $Host)
        $runspacepool.Open()

        Write-Verbose "Creating empty collection to hold runspace jobs"
        $Script:runspaces = New-Object System.Collections.ArrayList

        #If inputObject is bound get a total count and set bound to true
        $bound = $PSBoundParameters.keys -contains "InputObject"
        if(-not $bound) {
            [System.Collections.ArrayList]$allObjects = @()
        }

        #Set up log file if specified
        if( $LogFile ){
            New-Item -ItemType file -path $logFile -force | Out-Null
            ("" | Select Date, Action, Runtime, Status, Details | ConvertTo-Csv -NoTypeInformation -Delimiter ";")[0] | Out-File $LogFile
        }

        #write initial log entry
        $log = "" | Select Date, Action, Runtime, Status, Details
        $log.Date = Get-Date
        $log.Action = "Batch processing started"
        $log.Runtime = $null
        $log.Status = "Started"
        $log.Details = $null
        if($logFile) {
            ($log | convertto-csv -Delimiter ";" -NoTypeInformation)[1] | Out-File $LogFile -Append
        }

        $timedOutTasks = $false

        #endregion INIT
    }

    Process {

        #add piped objects to all objects or set all objects to bound input object parameter
        if($bound) {
            $allObjects = $InputObject
        }
        Else {
            [void]$allObjects.add( $InputObject )
        }
    }

    End {

        #Use Try/Finally to catch Ctrl+C and clean up.
        Try {
            #counts for progress
            $totalCount = $allObjects.count
            $script:completedCount = 0
            $startedCount = 0

            foreach($object in $allObjects){

                #region add scripts to runspace pool

                #Create the powershell instance, set verbose if needed, supply the scriptblock and parameters
                $powershell = [powershell]::Create()

                if ($VerbosePreference -eq 'Continue') {
                    [void]$PowerShell.AddScript({$VerbosePreference = 'Continue'})
                }

                [void]$PowerShell.AddScript($ScriptBlock).AddArgument($object)

                if ($parameter) {
                    [void]$PowerShell.AddArgument($parameter)
                }

                # $Using support from Boe Prox
                if ($UsingVariableData) {
                    Foreach($UsingVariable in $UsingVariableData) {
                        Write-Verbose "Adding $($UsingVariable.Name) with value: $($UsingVariable.Value)"
                        [void]$PowerShell.AddArgument($UsingVariable.Value)
                    }
                }

                #Add the runspace into the powershell instance
                $powershell.RunspacePool = $runspacepool

                #Create a temporary collection for each runspace
                $temp = "" | Select-Object PowerShell, StartTime, object, Runspace
                $temp.PowerShell = $powershell
                $temp.StartTime = Get-Date
                $temp.object = $object

                #Save the handle output when calling BeginInvoke() that will be used later to end the runspace
                $temp.Runspace = $powershell.BeginInvoke()
                $startedCount++

                #Add the temp tracking info to $runspaces collection
                Write-Verbose ( "Adding {0} to collection at {1}" -f $temp.object, $temp.starttime.tostring() )
                $runspaces.Add($temp) | Out-Null

                #loop through existing runspaces one time
                Get-RunspaceData

                #If we have more running than max queue (used to control timeout accuracy)
                #Script scope resolves odd PowerShell 2 issue
                $firstRun = $true
                while ($runspaces.count -ge $Script:MaxQueue) {

                    #give verbose output
                    if($firstRun){
                        Write-Verbose "$($runspaces.count) items running - exceeded $Script:MaxQueue limit."
                    }
                    $firstRun = $false

                    #run get-runspace data and sleep for a short while
                    Get-RunspaceData
                    Start-Sleep -Milliseconds $sleepTimer
                }

                #endregion add scripts to runspace pool
            }

            Write-Verbose ( "Finish processing the remaining runspace jobs: {0}" -f ( @($runspaces | Where {$_.Runspace -ne $Null}).Count) )
            Get-RunspaceData -wait

            if (-not $quiet) {
                Write-Progress -Activity "Running Query" -Status "Starting threads" -Completed
            }
        }
        Finally {
            #Close the runspace pool, unless we specified no close on timeout and something timed out
            if ( ($timedOutTasks -eq $false) -or ( ($timedOutTasks -eq $true) -and ($noCloseOnTimeout -eq $false) ) ) {
                Write-Verbose "Closing the runspace pool"
                $runspacepool.close()
            }

            #collect garbage
            [gc]::Collect()
        }
    }
}

$StorageAccountName = $StorageAccount.ToLower()

# See if we already have a session. If we do, don't re-authenticate.
if (!$AzureRMAccount.Context.Tenant) {
    $AzureRMAccount = Add-AzureRmAccount
}

$SubscriptionName = Get-AzureRmSubscription | sort SubscriptionName | Select SubscriptionName
$TenantId = $AzureRMAccount.Context.Tenant.TenantId

Select-AzureRmSubscription -TenantId $TenantId
write-host "Enumerating VMs from AzureRM in Resource Group '" $ResourceGroup "' from '" $StorageAccountName "'"

$StorageVMs = get-azurermvm | where {$_.storageprofile.osdisk.vhd.uri -like "*$StorageAccountName.$StorageSuffix*"}
$vmrunninglist = @()
$vmstoppedlist = @()

Foreach($vmonstore in $StorageVMs) {
    $vmstatus = Get-AzureRMVM -ResourceGroupName $ResourceGroup -Name $vmonstore.name -Status
    $PowerState = (get-culture).TextInfo.ToTitleCase(($vmstatus.statuses)[1].code.split("/")[1])

    write-host "VM: '"$vmonstore.Name"' is" $PowerState
    if ($Powerstate -eq 'Running') {
        $vmrunninglist = $vmrunninglist + $vmonstore.name
    }
    if ($Powerstate -eq 'Deallocated') {
        $vmstoppedlist = $vmstoppedlist + $vmonstore.name
    }
}

if ($Power -eq 'start') {
    write-host "Starting VMs "$vmstoppedlist " in Resource Group "$ResourceGroup
    $vmstoppedlist | Invoke-Parallel -ImportVariables -NoCloseOnTimeout -ScriptBlock {
        Start-AzureRMVM -ResourceGroupName $ResourceGroup -Name $_ -Verbose
    }
}

if ($Power -eq 'stop') {
    write-host "Stopping VMs "$vmrunninglist " in Resource Group "$ResourceGroup
    $vmrunninglist | Invoke-Parallel -ImportVariables -NoCloseOnTimeout -ScriptBlock {
        Stop-AzureRMVM -ResourceGroupName $ResourceGroup -Name $_ -Verbose -Force
    }
}
[/powershell]

Moving VHDs from one Storage Account to Another (Part 1)

powershell.jpg

We are often asked to review a customer's Azure environment. Part of that includes the review of specific applications for stability, availability, scalability and overall design. As customers move to Azure, they have to forget some of the best practices they've implemented in their on-premises environments. The cloud is different. Many customers' first experience with Azure is an IaaS experience. Even there, things are different. Features like Availability Sets, Storage Accounts, Upgrade and Fault Domains come into play. Decisions about these features can have long-lasting effects on their applications and environments. One relatively common scenario we see is that, while deploying systems for an application, the Storage Account is ignored and all VMs of a specific role (or even all instances of an application) are placed in a single Storage Account. In fact, we see Availability Sets and Storage Accounts map 1 to 1 fairly regularly. We won't go into all of the issues that you may run into with a single Storage Account (that's a whole other blog post), but will focus on the fact that it can become a single point of failure. If all the VMs of a specific role reside in a single Storage Account and something happens to that Storage Account, no more VMs. Availability Set be damned. Your VMs (and application) are offline.

Let's see how you can use PowerShell to move VHDs (OS and/or data disks) from one Storage Account to another.  Let's get started...

Once you have your PowerShell console up, you will have to log into Azure:

[powershell]Login-AzureRmAccount[/powershell]

Note that this may fail if you have never connected to Azure from PowerShell on your computer. If that is the case, you will need to download the PublishSettings file from Windows Azure and import it into your computer. Instructions on how to do this are found here.

At this point, you will be presented with an Azure login dialog. Enter your credentials and your PowerShell session will be connected to and authenticated against Azure.

Next, you'll need the Resource Group that the VM (and therefore VHD) resides in. You can find that in the Azure Portal by simply navigating to the Virtual Machines Blade:

ResourceGroupViaVMBlade

Let's store that in a variable:

[powershell]$RGname = "Default-Web-WestUS"[/powershell]

We'll also need the name of the VM:

[powershell]$vmname = "VM1"[/powershell]

Before we can move the VHD to the second Storage Account, the VM has to be turned off (you can't move the disk while it's being accessed). As you can see in the screenshot above, my VM is already stopped. The following command will stop the VM if it happens to be running:

[powershell]get-azurermvm -name $vmname -ResourceGroupName $RGname | stop-azurermvm[/powershell]

The last VM specific property we need is the name of the VHD. This can be found in the Storage Account Blade (Storage Accounts -> sourceStorageAccount -> Blobs -> ContainerName):

StorageAccounts

Blobs

Blob

Notice that the Container name is vhds, which is the default name when creating your first VM in the Storage Account. Yours might be different, so make a note of it.

Let's store the name of the VHD in a variable:

[powershell]$vhdName = "VM12016621122342.vhd"[/powershell]

Next, we'll need some information about the source and destination Storage Accounts:

  • Names of the Storage Accounts
  • Access Keys for the Storage Accounts
  • Names of the Containers storing the VHDs

The names of the Storage Accounts and the names of the Containers were shown in the previous screenshots. The Access Keys for the Storage Accounts are stored in the Settings blade for the Storage Account:

AccessKeys

Note: Keep these keys secret as they can grant anyone access to your Storage Account. Microsoft recommends that you keep one key for connection, and regenerate the other. Be aware that if you regenerate the keys, any application that is connecting to this Storage Account will need to be updated with the newly regenerated key.
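You can also pull the keys with PowerShell instead of copying them out of the portal. A quick sketch, reusing the resource group variable from above; note that the shape of the output changed between AzureRM module versions (a Key1/Key2 object on older releases, an array of key objects with a Value property on newer ones):

[powershell]
# Sketch: retrieve the access keys for the source Storage Account rather than copying them from the portal.
# Older AzureRM.Storage versions return an object with Key1/Key2; newer versions return an array with .Value entries.
Get-AzureRmStorageAccountKey -ResourceGroupName $RGname -Name "storageaccountmigration1"
[/powershell]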

Now that we have this information, we can set up the next group of variables.  For the source Storage Account:

[powershell]$sourceSAName = "storageaccountmigration1"

$sourceSAKey = "InsertKeyHere"

$sourceSAContainerName = "vhds" [/powershell]

And for the destination Storage Account:

[powershell]$destinationSAName = "storageaccountmigration2"

$destinationSAKey = "InsertKeyHere"

$destinationContainerName = "vhds" [/powershell]

A Storage context for each of the Storage Accounts is needed. This Storage Context will be used when copying the VHDs between the 2 Blob Storage locations (more information about the Storage Context commands can be found here):

[powershell]$sourceContext = New-AzureStorageContext -StorageAccountName $sourceSAName -StorageAccountKey $sourceSAKey

$destinationContext = New-AzureStorageContext -StorageAccountName $destinationSAName -StorageAccountKey $destinationSAKey [/powershell]

Note: This script assumes that the destination container has already been created. If you need to create this container, you can either use the Portal, or the following PowerShell command:

[powershell]$destinationContainerName = "destinationvhds"

New-AzureStorageContainer -Name $destinationContainerName -Context $destinationContext[/powershell]

The final step is to start the blob copy. For this, we use the Start-AzureStorageBlobCopy command.

[powershell]$blobCopy = Start-AzureStorageBlobCopy -DestContainer $destinationContainerName -DestContext $destinationContext -SrcBlob $vhdName -Context $sourceContext -SrcContainer $sourceSAContainerName [/powershell]

You will notice that executing this cmdlet returns the prompt after a few seconds. The blob copy is not actually done; it's just running in the background.  The following command will show you its current status:

[powershell]

($blobCopy | Get-AzureStorageBlobCopyState).Status [/powershell]

The status will be Pending until the copy is completed. Alternatively, the following loop will show you the actual progress of the copy (refreshed every second):

[powershell]$TotalBytes = ($blobCopy | Get-AzureStorageBlobCopyState).TotalBytes

cls

while(($blobCopy | Get-AzureStorageBlobCopyState).Status -eq "Pending")

{

Start-Sleep 1

$BytesCopied = ($blobCopy | Get-AzureStorageBlobCopyState).BytesCopied

$PercentCopied = [math]::Round($BytesCopied/$TotalBytes * 100,2)

Write-Progress -Activity "Blob Copy in Progress" -Status "$PercentCopied% Complete:" -PercentComplete $PercentCopied

}[/powershell]

Progress

Once the copy is completed, the progress bar will disappear and you will be returned to the PowerShell prompt.

And the full script:

[powershell]# Login to Azure

Login-AzureRmAccount

# Set Resource Group Name

$RGname= "Default-Web-WestUS"

# Set VM Name

$vmname = "VM1"

# Stop VM

get-azurermvm -name $vmname -ResourceGroupName $RGname | stop-azurermvm

# Set name of VHD blob to copy

$vhdName = "VM12016621122342.vhd"

# Source Storage Account Information

$sourceSAName = "storageaccountmigration1"

$sourceSAKey = "1UzJEeop8MW/jOE5eX9ejilO1x6gwxxcMGIVdO36uchtwL128h3LzGQAt1CpFxs03E5FlGveCNkwhpvxQTCTTA=="

$sourceSAContainerName = "vhds"

# Destination Storage Account Information

$destinationSAName = "storageaccountmigration2"

$destinationSAKey = "dN6rMnqeUxkBkzpeOLS5wns6UJcL2zjGIj7cTGZ8if0ZNumyvrdDytW9LuiW6Qc/knkeoeTg+ejrFrHsmqzb4w=="

$destinationContainerName = "vhds"

# Source Storage Account Context

$sourceContext = New-AzureStorageContext -StorageAccountName $sourceSAName -StorageAccountKey $sourceSAKey

# Destination Storage Account Context

$destinationContext = New-AzureStorageContext -StorageAccountName $destinationSAName -StorageAccountKey $destinationSAKey

# Copy the blob

$blobCopy = Start-AzureStorageBlobCopy -DestContainer $destinationContainerName -DestContext $destinationContext -SrcBlob $vhdName -Context $sourceContext -SrcContainer $sourceSAContainerName [/powershell]

Note that you can also use the AzCopy command to perform some of these actions.
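As a rough sketch of that approach (this uses the classic Windows AzCopy /Source:/Dest: syntax and its default install path, both of which vary with the AzCopy version you have installed), the same server-side copy would look something like this:

[powershell]
# Sketch: the same blob copy using AzCopy.
# Classic /Source /Dest syntax; the install path and flags differ on newer AzCopy versions.
& "C:\Program Files (x86)\Microsoft SDKs\Azure\AzCopy\AzCopy.exe" `
    "/Source:https://$sourceSAName.blob.core.windows.net/$sourceSAContainerName" `
    "/Dest:https://$destinationSAName.blob.core.windows.net/$destinationContainerName" `
    "/SourceKey:$sourceSAKey" "/DestKey:$destinationSAKey" "/Pattern:$vhdName"
[/powershell]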

We've shown you how straightforward it is to move your VHDs from one Storage Account to another. In part 2 (coming soon) we will look at two things: first, we'll automate the script to remove the hardcoded variables and make it simpler to select each of the properties needed for the VHD move; second, we'll look at actually creating the new VM with the disks in the new Storage Account.

Exporting Azure Resource Manager VM properties with PowerShell

powershell.jpg

On a recent project, I needed a list of all the VMs running in a subscription along with some of each VM's properties. We had an Excel spreadsheet with all the VMs and properties, but going through that was a real pain.  So, I wrote a basic PowerShell script to collect the information I needed and figured I would share it. The script is pretty straightforward (the full script is at the end of the post) and does the following:

  • Logs into Azure
  • Gets all the Virtual Machines
  • Gets specific properties of each VM
  • Generates a Grid View with all the selected properties

Logging into Azure

Logging into Azure from PowerShell is a simple command:

[powershell]Login-AzureRmAccount[/powershell]

Note that this may fail if you have never connected to Azure from PowerShell on your computer. If that is the case, you will need to download the PublishSettings file from Windows Azure and import it into your computer. Instructions on how to do this are found here.

Once that command is executed, you'll be presented with the Azure Login page (as shown below). Simply log into Azure using your Azure credentials and your PowerShell session will be authenticated with your Azure account.

AzureLoginPage

If the login is successful, you'll get something similar to the screenshot below:

SuccessfulLogin

Get Virtual Machines

Next, let's get all the VMs in our subscription using the Get-AzureRMVM command. Once you run that command, you'll see a ton of information scroll by on the screen. To make it simpler, let's store the output in a variable:

[powershell]$RMVMs=Get-AzurermVM[/powershell]

We'll use this object in the next section...

Get VM Properties

One of the most important properties we'll use is the name of the VM. If you look at the text that flew by earlier (or type in $RMVMs to see it again), you'll notice all the properties that are available to you. One of them is Name:

VMName

We can display just the names of the VMs in the $RMVMs object by appending .Name on the end:

[powershell]$RMVMs.Name[/powershell]

The output will return just the names:

[powershell]VM1

VM2[/powershell]

Nested properties, such as OSType, are just as straightforward to get:

[powershell]$RMVMs.storageprofile.osdisk.ostype[/powershell]

Will return:

[powershell]Windows

Windows[/powershell]

Feel free to look through the properties available.  Next, we will display it in the Grid View.

Display in a Grid View

For the Grid View, we need an array with all the properties that we want to collect.  First, we need an empty array that we'll call $RMVMArray:

[powershell]$RMVMArray = @()[/powershell]

Next, we'll loop through each of the VMs in the RMVMs object:

[powershell]foreach ($vm in $RMVMs) { ... }[/powershell]

And add some properties we want to see:

[powershell]
foreach ($vm in $RMVMs)
{
    # Generate Array
    $RMVMArray += New-Object PSObject -Property @{
        Name = $vm.Name;
        Location = $vm.Location;
        OSType = $vm.StorageProfile.OsDisk.OsType;
    }
}
[/powershell]

Finally, we can display the Grid View:

[powershell]$RMVMArray | Out-Gridview[/powershell]

You'll get a Grid View similar to the one below:

GridView

Pulling it all together

Now that we have each of the pieces, let's pull them into a script that we can simply run every time we want a list of the VMs in our subscription and their properties. In my case, I called the script GetRMVMProperties.ps1.  And the full code:

[powershell]
# Log in to Azure
Login-AzureRmAccount

# Make sure there is at least one VM in the Subscription
($RMVMs = Get-AzurermVM).Count -gt 0

# Create array to contain all the VMs in the subscription
$RMVMArray = @()

# Loop through VMs
foreach ($vm in $RMVMs)
{
    # Get VM Status (for Power State)
    $vmStatus = Get-AzurermVM -Name $vm.Name -ResourceGroupName $vm.ResourceGroupName -Status

    # Generate Array
    $RMVMArray += New-Object PSObject -Property @{

        # Collect Properties
        Name = $vm.Name;
        PowerState = (get-culture).TextInfo.ToTitleCase(($vmStatus.statuses)[1].code.split("/")[1]);
        Location = $vm.Location;
        Tags = $vm.Tags
        Size = $vm.HardwareProfile.VmSize;
        ImageSKU = $vm.StorageProfile.ImageReference.Sku;
        OSType = $vm.StorageProfile.OsDisk.OsType;
        OSDiskSizeGB = $vm.StorageProfile.OsDisk.DiskSizeGB;
        DataDiskCount = $vm.StorageProfile.DataDisks.Count;
        DataDisks = $vm.StorageProfile.DataDisks;
    }
}

# Gridview output
$title = "VMs in the '{0}' Subscription" -f $Subscriptions.SubscriptionName
$RMVMArray | Sort-Object -Property Name | Out-Gridview -Title $title
[/powershell]

Next Steps

It actually took considerably longer to write this blog post than to write the script. The customer I wrote this script for has multiple subscriptions, so that's the next step: modify the script to automatically step through each subscription and generate a Grid View for each.
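As a rough sketch of where that is headed (keep in mind that the subscription property names, SubscriptionId/SubscriptionName versus Id/Name, differ between AzureRM module versions):

[powershell]
# Sketch: repeat the VM collection for every subscription the account can see.
# Property names (SubscriptionId/SubscriptionName vs. Id/Name) vary by AzureRM module version.
$Subscriptions = Get-AzureRmSubscription
foreach ($sub in $Subscriptions)
{
    Select-AzureRmSubscription -SubscriptionId $sub.SubscriptionId | Out-Null
    $RMVMArray = @()
    $RMVMs = Get-AzurermVM
    # ... populate $RMVMArray with the same foreach loop used above ...
    $title = "VMs in the '{0}' Subscription" -f $sub.SubscriptionName
    $RMVMArray | Sort-Object -Property Name | Out-Gridview -Title $title
}
[/powershell]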

The Concept of Cloud Adjacency

datacenter.jpg

Several years ago, when we first started theorizing about hybrid cloud, it was clear that to do our definition of hybrid cloud properly, you needed a fast, low latency connection between your private cloud and the public clouds you were using. After talking to major enterprise customers, we theorized five cases where this type of arrangement made sense and actually accelerated cloud adoption.

  1. Moving workloads but not data - If you expect to move workloads between cloud providers, and between public and private clouds, to take advantage of capacity and price moves, then moving the workloads quickly and efficiently means, in some cases, not moving the data.
  2. Regulatory and Compliance Reasons - Say the data is sensitive and must remain under corporate physical ownership while at rest, but you still want to use the cloud for the applications around that data (possibly including front ending it).
  3. Application Support - A case where you have a core application that can't move to cloud due to support requirements (EPIC Health is an example of this type of application), but you have other surround applications that can move to cloud and need a fast, low latency connection back to this main application.
  4. Technical Compatibility - A case where there is a technical requirement, say real-time storage replication to a DR site or very high IOPS, that Azure can't for one reason or another handle. In this case that data can be placed on a traditional storage infrastructure and presented to the cloud over the high bandwidth, low latency connection.
  5. Cost - There are some workloads which today are significantly more expensive to run in cloud than on existing large on-premises servers. These are mostly very large compute platforms that scale up rather than out, especially in cases where a customer already owns this type of equipment and it isn't fully depreciated. Again, it may make sense to run the surround systems that fit on commodity two socket boxes in public cloud, while retaining the large four socket, high memory instances until they are fully depreciated.

 

All of these scenarios require a low latency, high bandwidth connection to the cloud, as well as a solid connection back to the corporate network. In these cases you have a couple of options for getting this type of connection.

  1. Place the equipment into your own datacenter, and buy high bandwidth Azure ExpressRoute and, if needed, Amazon Direct Connect connections from an established carrier like AT&T, Level3, Verizon, etc.
  2. Place the equipment into a carrier colocation facility and create a high bandwidth connection from there back to your premises. This places the equipment closer to the cloud but introduces latency between you and the equipment. This works because most applications assume latency between the user and the application, but are far less tolerant of latency between the application tier and the database tier, or between the database and its underlying storage. Additionally, you can place a traditional WAN accelerator (Riverbed or the like) on the link between the colocation facility and your premises.

 

For simply connecting your network to Azure, both options work equally well. If you have one of the five scenarios above, however, the latter option (colocation) is better. Depending on the colocation provider, the latency will be low to very low (2-40ms depending on location). All of the major carriers offer a colocation arrangement that is close to the cloud edge. At Avanade, I run a fairly large hybrid cloud lab environment, which allows us to perform real world testing and demonstrations of these concepts. In our case we’ve chosen to host the lab with one of these colocation providers, Equinix. Equinix has an interesting story in that they are a carrier neutral facility. In other words, they run the datacenter and provide a brokerage (The Cloud Exchange), but don’t actually sell private line circuits. You connect through their brokerage directly to the other party. This is interesting because it means I don’t pay a 10 gig local loop fee; I simply pay for access to the exchange. Right now I buy two 10 gig ports to the exchange, and over those I have 10 gigs of ExpressRoute and 10 gigs of Amazon Direct Connect delivered to my edge routers. The latency between a VM running in my private cloud in the lab and a VM in Amazon or Azure is sub 2ms. This is incredibly fast. Fast enough to allow me to showcase each of the above scenarios.

 

We routinely move workloads between providers without moving the underlying data by presenting the data disks over iSCSI from our NetApp FAS to the VM running at the cloud provider. When we move the VM, we migrate the OS and system state, but the data disk is simply detached and reattached at the new location. This is possible due to the low latency connection and is a capability that NetApp sells as NetApp Private Storage, or NPS. This capability is also useful for the regulatory story of keeping data at rest under our control. The data doesn't live in the cloud provider and we can physically pinpoint its location at all times. Further, this meets some of the technical compatibility scenario. Because my data is backed by a capable SAN with proper SAN-based replication and performance characteristics, I can run workloads that I may not otherwise have been able to run in cloud due to IOPS, cost for performance, or feature challenges with cloud storage.

 

Second, I have compute located in this adjacent space, some of it very large quad socket machines with multiple TB of RAM running very large line of business applications. In this case those LOB applications have considerable surround applications that can run just fine on public cloud compute, but the big one isn’t cost effective to run on public cloud since I’ve already invested in this hardware. Or perhaps I’m not allowed to move it to public cloud due to support constraints. Another example of this is non-x86 platforms. Let’s say I have an IBM POWER or mainframe platform. I don’t want to retire it or re-platform it due to cost, but I have a number of x86 based surround applications that I’d like to move to cloud. I can place my mainframe within the cloud adjacent space and access those resources as if they too were running in the public cloud.

 

As you can see, cloud adjacent space opens up a number of truly hybrid scenarios. We’re big supporters of the concept when it makes sense, and while there are complexities, it can easily unlock moving additional workloads to public cloud.

Writing your own custom data collection for OMS using SCOM Management Packs

OMS-Cover-Photo.png

[Unsupported] OMS is a great data collection and analytics tool, but at the moment it also has some limitations. Microsoft has been releasing Integration Pack after Integration Pack and adding features at a decent pace, but unlike the equivalent SCOM Management Packs the IPs are somewhat of a black box. A while back, I got frustrated with the lack of configuration options in OMS (then Op Insights) and decided to “can opener” some of the features. I figured the best way to do this was to examine some of the management packs that OMS would install into a connected SCOM Management Group and start digging through their contents. Now granted, lately the OMS team has added a lot more customization options than they had when I originally traced this out. You can now add custom performance collection rules, custom log collections, and more, all from within the OMS portal itself. However, there are still several advantages to being able to create your own OMS collection rules in SCOM directly. These include:

  • Additional levels of configuration customization beyond what’s offered in the OMS portal, such as the ability to target any class you want or use more granular filter criteria than is offered by the portal.

  • The ability to migrate your collection configuration from one SCOM instance to another. OMS doesn’t currently allow you to export custom configuration.

  • The ability to do bulk edits through the myriad of tools and editors available for SCOM management packs (it’s much easier to add 50 collection rules to an existing SCOM MP using a simple RegEx find/replace than it is to hand enter them into the OMS SaaS portal).

Digging into the Management Packs automatically loaded into a connected SCOM instance from OMS, we find that there are quite a few. A lot of them still bear the old “System Center Advisor” filenames and display strings from before Advisor got absorbed into OMS, but the IPs also add a bunch of new MPs that include “IntelligencePacks” in their IDs, making them easier to filter by. Many of the type definitions are stored in a pack named (unsurprisingly) Microsoft.IntelligencePacks.Types.
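If you want to pull these down and dig through them yourself, the quickest way is to export everything with “IntelligencePack” in the name from the Operations Manager Shell. A minimal sketch (the export folder is just a placeholder and needs to exist first):

[powershell]
# Sketch: export the OMS/Advisor management packs from a connected management group for inspection.
# "C:\Temp\OMSMPs" is a placeholder path and must already exist.
Import-Module OperationsManager
Get-SCOMManagementPack -Name "*IntelligencePack*" | Export-SCOMManagementPack -Path "C:\Temp\OMSMPs"
[/powershell]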

Here is the Event Collection Write Action Module Type Definition:

Event Collector Type

Now let’s take a look at one of the custom event collection rules I created in the OMS portal to grab all Application log events. These rules are contained in the Microsoft.IntelligencePack.LogManagement.Collection MP:

Application Log Collection Rule

We can see that the event collection rule for OMS looks an awful lot like a normal event collection rule in SCOM. The ID is automatically generated according to a naming convention that OMS keeps track of: Microsoft.IntelligencePack.LogManagement.Collection.EventLog. followed by a unique ID string that identifies the specific rule to OMS. The only real difference between this rule and a standard event collection rule is the write action, which is the custom write action we saw defined in the Types pack that’s designed to write the event to OMS instead of to the SCOM Data Warehouse. So all you need to do to create your own custom event data collection rule in SCOM is add a reference to the Types MP to your custom MP, like:

<Reference Alias="IPTypes"> <ID>Microsoft.IntelligencePacks.Types</ID> <Version>7.0.9014.0</Version> <PublicKeyToken>31bf3856ad364e35</PublicKeyToken> </Reference>

…and then either replace the write action in a standard event collection rule with the following, or add it as an additional write action (you can actually collect to both databases using a single rule):

<WriteAction ID="HttpWA" TypeID="IPTypes!Microsoft.SystemCenter.CollectCloudGenericEvent" />

Now granted, OMS lets you create your own custom event log collection rules using the OMS portal, but at the moment the level of customization and filtering available there is pretty limited. You can specify the name of the log and you can select any combination of three event levels (ERROR, WARNING, and INFORMATION). If you build the rules in SCOM instead, you can filter them down based on any additional criteria you can express in a standard SCOM management pack. In large enterprises, this can help keep your OMS consumption costs down by leaving out events that you know you do not need to collect.

If we look at Performance Data next, we see that there are two custom Write Action types that are of interest to us:

Perf Data Collector Type

…and…

Perf Agg Data Collector Type

…which are the Write Action Modules used for collecting custom performance data and the aggregates for that performance data, respectively. If we take a look at some of the performance collection rules that use these types, we can see how we can use them ourselves. Surprisingly, in the current iteration of OMS they get stored in the Microsoft.IntelligencePack.LogManagement.Collection MP along with the event log collection rules. Here’s an example of the normal collection rule generated in SCOM by adding a rule in OMS:

Perf Data Collector Rule

And just like what we saw with the Event Collection rules, the only difference between this rule and a normal SCOM Performance Collection rule is that instead of the write action to write the data to the Operations Manager DB or DW, we have a “write to cloud” Write Action. So all we need to do in order to add OMS performance collection to existing performance rules is add a reference to the Types MP:

<Reference Alias="IPTypes"> <ID>Microsoft.IntelligencePacks.Types</ID> <Version>7.0.9014.0</Version> <PublicKeyToken>31bf3856ad364e35</PublicKeyToken> </Reference>

And then add the custom write action to the Write Actions section of any of our existing collection rules. As with the Event Collection rules, we can use multiple write actions, so a single rule is capable of writing to the Operations Manager database, the data warehouse, and OMS.

<WriteAction ID="WriteToCloud" TypeID="IPTypes!Microsoft.SystemCenter.CollectCloudPerformanceData_PerfIP" />

Now in addition to the standard collection rule we also have an aggregate collection rule that looks like this:

App Log Agg Collection Rule

This rule looks almost exactly like the previous collection rule, except for two big differences. One is that it uses a different write action:

<ConditionDetection ID="Aggregator" TypeID="IPPerfCollection!Microsoft.IntelligencePacks.Performance.PerformanceAggregator">

…and there is an additional Condition Detection that wasn’t present in the standard collection rule for the aggregation:

<ConditionDetection ID="Aggregator" TypeID="IPPerfCollection!Microsoft.IntelligencePacks.Performance.PerformanceAggregator"> <AggregationIntervalInMinutes>30</AggregationIntervalInMinutes> <SamplingIntervalSeconds>10</SamplingIntervalSeconds> </ConditionDetection>

Changing the value for AggregationIntervalInMinutes allows you to change the aggregation interval, which is something that you cannot do in the OMS portal. Otherwise, the native custom performance collection feature of OMS is pretty flexible and allows you to use any Object/Counter/Instance combination you want. However, if your organization already uses SCOM, there’s a good chance that you already have a set of custom SCOM MPs that you use for performance data collection. Adding a single write action to those existing rules, plus an optional aggregation rule, across 100 pre-existing rules is likely easier for an experienced SCOM author than hand-entering 100 custom performance collection rules into the OMS portal. The other benefits of doing it this way include the ability to bulk edit (changing the threshold for all the counters, for example, would be a simple find/replace instead of manually changing each rule) and the ability to export this configuration. OMS lets you export data, but not configuration. Any time you spend hand-entering configuration into the OMS portal would have to be repeated for any other OMS workspace you want to use that configuration in. A custom SCOM MP, however, can be put into source control and re-used in as many different environments as you like.
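For example, a bulk change to your own custom MP on disk can be as simple as a find/replace from PowerShell. A sketch, where the filename is a placeholder for your own unsealed management pack (not one of the OMS-managed ones, per the note below):

[powershell]
# Sketch: bump every 30 minute aggregation interval in a custom MP to 60 minutes.
# "MyCustom.OMS.Collection.xml" is a placeholder for your own unsealed management pack file.
(Get-Content ".\MyCustom.OMS.Collection.xml" -Raw) `
    -replace '<AggregationIntervalInMinutes>30</AggregationIntervalInMinutes>', '<AggregationIntervalInMinutes>60</AggregationIntervalInMinutes>' |
    Set-Content ".\MyCustom.OMS.Collection.xml"
[/powershell]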

Note: When making modifications to any rules, do not make changes to the unsealed OMS managed MPs in SCOM. While these changes probably won’t break anything, OMS is the source of record for the content of those MPs. If you make a change in SCOM, it will be disconnected from the OMS config and will likely be overwritten the next time you make a change in OMS.

One last thing. Observant readers may have noticed that every rule I posted is disabled by default. OMS does this for every custom rule and then enables the rules through an override contained within the same unsealed management pack, so this is normal. This is presumably because adjusting an override to enable/disable something is generally considered a “lighter” touch than editing a rule directly, although I don’t see any options to disable any of the collection rules (only delete them).