Adding Additional Nodes to Azure Stack


Last week, I had the opportunity to add some extra capacity to a four-node appliance that I look after. Luckily, I got to double the capacity, making it an eight-node scale unit. This post documents my experience and fills in the gaps that the official documentation doesn’t cover 😊 From a high-level perspective, these are the activities that take place:

  • Additional nodes/servers are racked and stacked in the same rack as the existing scale unit.

  • The node BMC interface is configured with an IP address and gateway within the management network, along with the correct username/password (the same as the existing nodes in the cluster).

  • The Azure Stack OEM updates the BMC and Top of Rack switch configurations to enable the additional switch ports for the new nodes.

  • The Azure Stack operator adds each additional node to the scale unit (one at a time):

    • Compute resource becomes available first

    • S2D re-balances the cluster; once complete, the additional storage is made available

Easy huh?

Here’s a bit more detail into each of the steps:

Each of the additional nodes being added to the scale unit must be identical to the existing servers. This includes CPU, memory, storage capacity, and hardware versions. The hardware must be installed as prescribed by the OEM and connected to the correct ports on the BMC and TOR switches. It is unclear whether this is the responsibility of the OEM or the operator, so check beforehand when you purchase your additional nodes.

For Azure Stack to be able to add the additional nodes into the scale unit, the BMC interface must be configured so that the IP/subnet/gateway are correct, along with a username and password that match the existing nodes in the scale unit. This is critical, as any misconfiguration will stop the node from being added. As an example, assume that the management network is set to 10.0.0.0/27 and we have 4 existing nodes in our scale unit. 10.0.0.1 would be our gateway address and 10.0.0.3 – 10.0.0.6 would be the IP addresses of nodes 1 – 4, so for our first additional node we would use 10.0.0.7, incrementing from there up to a maximum of 16 nodes (10.0.0.18).
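To make the addressing scheme concrete, here’s a quick PowerShell sketch that derives the BMC IP for a given node number. The network prefix and starting host ID are assumptions matching the example above:

# Management network values (assumptions matching the example above)
$networkPrefix   = "10.0.0."   # management network 10.0.0.0/27
$firstNodeHostId = 3           # node 1 = 10.0.0.3

# BMC IP for node N = prefix + (firstNodeHostId + N - 1)
$nodeNumber = 5                # the first additional node in a four-node scale unit
$bmcIp = $networkPrefix + ($firstNodeHostId + $nodeNumber - 1)
Write-Output "Node $nodeNumber BMC IP: $bmcIp"   # -> 10.0.0.7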

The network switches must be configured by the OEM. There is currently no provision for this additional configuration to be carried out via automation, and if the network switches were opened up to an operator, it would break the principle of Azure Stack being a black-box appliance. The switches need reconfiguring to enable the additional ports on the BMC and the two Top of Rack switches. Unused ports are purposely not enabled to keep the configuration as secure as possible.

Before attempting to add the additional nodes, check in the Administrator Portal whether any existing FRU (Field Replaceable Unit) operations are taking place (e.g. the rebuild of an existing node due to a hardware issue).
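You can also do a quick check from PowerShell using the same Fabric admin cmdlet shown later in this post; any node not in a Running state suggests an operation may be in flight. (The exact status strings are assumptions; the portal shows values such as Running, Stopped and Adding.)

# List any scale unit nodes that are not currently in a Running state
Get-AzsScaleUnitNode | Where-Object { $_.ScaleUnitNodeStatus -ne "Running" } |
    Select-Object Name, ScaleUnitNodeStatus, PowerState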

OK, so once all the above has been carried out, the Azure Stack operator can start to add the additional nodes. From the Administrator portal:

Select Dashboard -> Region Management

Select Scale Units to open the blade:

We currently have a nice and healthy four-node cluster 😊. Select Add Node to open the configuration blade.

With the current release, as there is only one region and one scale unit, there is only one option that we can select for each of the first two drop-downs.

Enter the BMC IP address of the additional node and select OK.
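As an aside, node addition can reportedly also be driven from PowerShell via the Azs.Fabric.Admin module. Treat the following as a sketch only: the cmdlet names and parameters are assumptions based on that module, and the node name, BMC IP, scale unit name and location are hypothetical values.

# Assumption: Azs.Fabric.Admin cmdlets for scale unit expansion
$node = New-AzsScaleUnitNodeObject -BMCIPv4Address "10.0.0.7" -ComputerName "AzS-Node05"
Add-AzsScaleUnitNode -ScaleUnit "S-Cluster" -NodeList $node -Location "local"

I used the portal for this expansion, so your mileage may vary with the above.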

My first attempt didn't work as the BMC was incorrectly configured (The Gateway address for the BMC adapter was not set).

This is the error you will see if there is a problem:

Correcting the gateway address solved the problem. Here is what you’ll see when the scale unit is expanding:

After a few minutes, you will see the additional node appear as a member of the s-cluster scale unit, albeit with a status of Stopped.

Clicking on the new node in the blade will show its status as ‘Adding’.

If you prefer, you can check the status via PowerShell. From a system that has the Azure Stack PowerShell modules installed, first connect to the admin endpoint environment. A minimal sketch of the connection, where the ARM endpoint URL is the default for an integrated system and an assumption here:
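# Register the admin environment (the endpoint URL is the integrated-system default; adjust for your deployment)
Add-AzureRmEnvironment -Name "AzureStackAdmin" -ArmEndpoint "https://adminmanagement.local.azurestack.external"

# Sign in to the admin environment
Login-AzureRmAccount -EnvironmentName "AzureStackAdmin"

Once connected, run: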

 
#Retrieve status for the scale unit
Get-AzsScaleUnit | Select-Object Name, State

#Retrieve status for each scale unit node
Get-AzsScaleUnitNode | Select-Object Name, ScaleUnitNodeStatus, PowerState

Whilst the expansion takes place, a critical alert is fired. It's safe to ignore this.
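If you want to see the alert from PowerShell, something like the following should work. A sketch, assuming the Azs.InfrastructureInsights.Admin module is installed and that the alert objects expose Severity/State/Title properties:

# List currently active critical alerts
Get-AzsAlert | Where-Object { $_.Severity -eq "Critical" -and $_.State -eq "Active" } |
    Select-Object Title, Severity, State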

A successfully completed node addition will show the power state as Running, plus the additional cores and memory available to the cluster:

It takes a little shy of 3 hours to complete the addition of a single cluster node:

Note: you can only add one extra node at a time; if you try to add another while an addition is in progress, an error will be thrown, as below:

I found that the first node added without a hitch, but subsequent nodes had some issues. I got error messages stating ‘Device not found’ on a couple of occasions. In hindsight, I suspect that the cluster was performing S2D operations in the background, and this caused some clashes for the newly added node. To fix this, I had to perform a ‘Repair’ on the new node, which invariably fixed the problem on the first attempt. If there were more information on what is actually happening under the hood, I could give a more qualified answer.
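The Repair can be triggered from the node in the portal, and there appears to be a PowerShell equivalent in the Azs.Fabric.Admin module. A sketch, assuming the Repair-AzsScaleUnitNode cmdlet and hypothetical node name, location and BMC IP values:

# Re-run the addition workflow for a node that failed to add cleanly
Repair-AzsScaleUnitNode -Name "AzS-Node06" -Location "local" -BMCIPv4Address "10.0.0.8"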

Eventually, all nodes were added 😊

Adding the additional nodes does not add storage to the scale unit until the S2D cluster has rebalanced. The only way you know that a rebalance is taking place is that the status of the scale unit shows as ‘Expanding’, and it will do for a long time after adding the additional node(s)!

Here’s how the Infrastructure File Shares blade looks whilst expansion is taking place:

Once expansion has completed, then the additional infrastructure file shares are created:

Unfortunately, there is no way to check the progress of the rebalance operation, either in the portal or via PowerShell. The Privileged Endpoint does include the Get-StorageJob cmdlet, but this is of no use unless the support session is unlocked. If it is unlocked, the following script can be used to check:


$ClusterName="s-cluster"

$jobs=(Get-StorageSubSystem -CimSession $ClusterName -FriendlyName Clus* | Get-StorageJob -CimSession $ClusterName)

if ($jobs){
	do{
		$jobs=(Get-StorageSubSystem -CimSession $ClusterName -FriendlyName Clus* | Get-StorageJob -CimSession $ClusterName)
		$count=($jobs | Measure-Object).count
		$BytesTotal=($jobs | Measure-Object BytesTotal -Sum).Sum
		$BytesProcessed=($jobs | Measure-Object BytesProcessed -Sum).Sum

		$percent=($jobs.PercentComplete)
		Write-output("$count Storage Job(s) Running. GBytes Processed: $($BytesProcessed/1GB) GBytes Total: $($BytesTotal/1GB) Percent: $($percent)% `r")

		Start-Sleep 10
	}until($jobs -eq $null)
}