
Troubleshooting automatic restart of Azure Web Jobs

Recently I was working on a production issue where web jobs hosted on Azure were restarting on their own or moving into a stopped state. There were no signs of a user manually restarting them (you can check for that via the Activity Log on your web app). So why was it happening? The answer lies in the web job logs. To make this easier to follow, I have laid out the most common reasons for an automatic restart, with sample logs (you can get the logs either from the Kudu web job dashboard or directly from the storage account).

Reason 1: Due to website shutdown/restart

[10/24/2016 06:53:36 > b3e7a2: SYS INFO] WebJob is stopping due to website shutting down
[10/24/2016 06:53:36 > b3e7a2: SYS INFO] Status changed to Stopping
[10/24/2016 06:53:39 > b3e7a2: INFO] Job host stopped
[10/24/2016 06:53:41 > b3e7a2: ERR ] Thread was being aborted.
[10/24/2016 06:53:42 > b3e7a2: SYS INFO] WebJob process was aborted
[10/24/2016 06:53:42 > b3e7a2: SYS INFO] Status changed to Stopped
[10/24/2016 06:59:17 > 521cd3: SYS INFO] Status changed to Starting
[10/24/2016 06:59:20 > 521cd3: SYS INFO] Run script '<YourWebJobName>.WebJob.exe' with script host - 'WindowsScriptHost'
[10/24/2016 06:59:20 > 521cd3: SYS INFO] Status changed to Running

Reason 2: Due to changes to files or file content in the web job directory (D:\home\site\wwwroot\app_data\jobs\continuous\<WebJobName>)

[10/29/2016 00:00:43 > 521cd3: SYS INFO] Detected WebJob file/s were updated, refreshing WebJob
[10/29/2016 00:00:43 > 521cd3: SYS INFO] Status changed to Stopping
[10/29/2016 00:00:47 > 8d7eea: SYS INFO] Detected WebJob file/s were updated, refreshing WebJob
[10/29/2016 00:00:47 > 8d7eea: SYS INFO] Status changed to Stopping
[10/29/2016 00:00:48 > 521cd3: INFO] Job host stopped
[10/29/2016 00:00:49 > 521cd3: ERR ] Thread was being aborted.
[10/29/2016 00:00:51 > 521cd3: SYS INFO] Status changed to Stopped
[10/29/2016 00:00:51 > 521cd3: SYS INFO] Status changed to Starting
[10/29/2016 00:00:52 > 521cd3: SYS INFO] WebJob process was aborted
[10/29/2016 00:00:52 > 521cd3: SYS INFO] Status changed to Stopped
[10/29/2016 00:00:52 > 521cd3: SYS INFO] Job directory change detected: Job file 'ApplicationInsights.config' timestamp differs between source and working directories.
[10/29/2016 00:00:51 > 8d7eea: INFO] Job host stopped
[10/29/2016 00:00:52 > 8d7eea: ERR ] Thread was being aborted.
[10/29/2016 00:00:52 > 8d7eea: SYS INFO] WebJob process was aborted
[10/29/2016 00:00:52 > 8d7eea: SYS INFO] Status changed to Stopped
[10/29/2016 00:00:53 > 8d7eea: SYS INFO] Status changed to Starting
[10/29/2016 00:00:54 > 8d7eea: SYS INFO] Job directory change detected: Job file 'ApplicationInsights.config' timestamp differs between source and working directories.
[10/29/2016 00:01:01 > 521cd3: SYS INFO] Run script '<YourWebJobName>.WebJob.exe' with script host - 'WindowsScriptHost'
[10/29/2016 00:01:01 > 521cd3: SYS INFO] Status changed to Running

Reason 3: Due to a web app and/or web job deployment

Easy to understand, since a deployment will trigger Reason 2.

Reason 4: Due to an Azure outage or maintenance

Assuming you can rule out Reasons 1, 3, and 4 (all fairly straightforward) using the standard Azure web app monitoring tools (such as Failure History), Reason 2 still requires further analysis.

Basically, Reason 2 means that if anything in the web job directory changes - a new file or folder is added or removed, automatically or manually - the web job will restart. Likewise, if the content of a file within the directory changes, that also triggers a web job restart.
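
When this is the reason, the next question is which files actually changed. A quick way to check (a minimal sketch, assuming the standard App Service folder layout) is to list the job directory by last write time from the Kudu PowerShell console:

[powershell]
# Run from the Kudu debug console (https://<yourapp>.scm.azurewebsites.net/DebugConsole/?shell=powershell).
# Lists the web job's files newest-first so you can spot what changed right before a restart.
$jobPath = 'D:\home\site\wwwroot\app_data\jobs\continuous\<WebJobName>'   # replace <WebJobName>
Get-ChildItem -Path $jobPath -Recurse -File |
    Sort-Object LastWriteTimeUtc -Descending |
    Select-Object LastWriteTimeUtc, Length, FullName -First 20
[/powershell]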

In that case you need to dig further into what triggered the directory or file content changes within the web job directory. Ask yourself the questions below:

  1. Were there any changes to web app settings via the Azure portal?
  2. Were there any runtime SDK upgrades, such as upgrading Application Insights or installing a site extension?
  3. Are you creating any temporary files at runtime within the web job directory?

You might want to review the application design if the answer to question 3 is yes; one possible mitigation is sketched below.
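
The sketch assumes the job only needs local scratch space (the folder and file names are made up for illustration): write temporary files to the App Service sandbox temp folder rather than the job directory, so the file watcher never sees them.

[powershell]
# Sketch: keep scratch files out of the web job directory by using the sandbox temp folder
# ($env:TEMP resolves to a local temp path on App Service), so changes there do not trigger a restart.
$scratchDir = Join-Path $env:TEMP 'MyWebJobScratch'   # hypothetical folder name
New-Item -ItemType Directory -Path $scratchDir -Force | Out-Null
$workFile = Join-Path $scratchDir 'work.tmp'          # hypothetical file name
'intermediate output' | Set-Content -Path $workFile
[/powershell]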

This helped me fix my production issue; I hope it does the same for you.


Automating OMS Searches, Schedules, and Alerts in Azure Resource Manager Templates


Recently, I had an opportunity to produce a proof of concept around leveraging Operations Management Suite (OMS) as a centralized point of monitoring, analytics, and alerting.  As in any good Azure project, codifying your Azure resources in an Azure Resource Manager template should be table stakes. While the official ARM schema doesn't yet have all the children for the Microsoft.OperationalInsights/workspaces type, the Log Analytics documentation does provide samples for interesting use cases that demonstrate data sources and saved searches.  Notably absent from this list, however, is anything about the /schedules and /actions types. Now, a clever Azure developer will realize that an ARM template is basically an orchestration and expression wrapper around the REST APIs.  In fact, I've yet to see a case where an ARM template supports a type that's not provided by those APIs.  With that knowledge and the alert API samples, we can adapt the armclient put <url> examples to an ARM template.
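
For orientation, this is roughly the shape of the armclient put call the alert API samples use for a schedule - the resource ID segments and JSON body here are illustrative, mirroring the properties used later in the template:

armclient put /subscriptions/{Subscription-Id}/resourceGroups/{Resource-Group}/providers/Microsoft.OperationalInsights/workspaces/{Workspace-Name}/savedSearches/{Search-Name}/schedules/{Schedule-Name}?api-version=2015-03-20 "{'properties': {'Interval': 5, 'QueryTimeSpan': 5, 'Active': 'true'}}"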

The Resources

What you see as an alert in the OMS portal is actually 2 to 3 distinct resources in the ARM model:

A mandatory schedules type, which defines the period and time window for the alert

{
  "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'))]",
  "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules",
  "apiVersion": "2015-03-20",
  "location": "[parameters('Workspace-Location')]",
  "dependsOn": [
    "[variables('Search-Id')]"
  ],
  "properties": {
    "Interval": 5,
    "QueryTimeSpan": 5,
    "Active": "true"
  }
}

A mandatory actions type, defining the trigger and email notification settings

{
  "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'), '/', variables('Alert-Name'))]",
  "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions",
  "apiVersion": "2015-03-20",
  "location": "[parameters('Workspace-Location')]",
  "dependsOn": [
    "[variables('Schedule-Id')]"
  ],
  "properties": {
    "Type": "Alert",
    "Name": "[parameters('Search-DisplayName')]",
    "Threshold": {
      "Operator": "gt",
      "Value": 0
    },
    "Version": 1
  }
}

An optional, secondary actions type, defining the alert's webhook settings

{
  "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'), '/', variables('Action-Name'))]",
  "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions",
  "apiVersion": "2015-03-20",
  "location": "[parameters('Workspace-Location')]",
  "dependsOn": [
    "[variables('Schedule-Id')]"
  ],
  "properties": {
    "Type": "Webhook",
    "Name": "[variables('Action-Name')]",
    "WebhookUri": "[parameters('Action-Uri')]",
    "CustomPayload": "[parameters('Action-Payload')]",
    "Version": 1
  }
}

The Caveats

The full template adds a few eccentricities that are necessary for fully functioning resources.

First, let's talk about the search.  In the OMS interface, when you create a search, it is named <Category>|<Name> behind the scenes.  It is also lower-cased.  Now, the APIs will not downcase any name you send in, but if you use the interface to update that API-deployed search, you will duplicate it rather than being prompted to overwrite.  This tells us that, for whatever reason, OMS' identifiers are case-sensitive.  This is why the template accepts a search category and name as parameters, then uses variables to lower-case them and match the interface's naming convention.

Second, a schedule's name must be unique within your workspace, though the actions need not be.  Again, the OMS UI will take over here and generate GUIDs for the resources behind the scenes.  Unfortunately for us, ARM templates don't support random number or GUID generation - a necessary evil of being deterministic.  While the template could accept a GUID by parameter, I find that asking my users for GUIDs is impolite - the uniqueString() function applied to the already-unique search name should suffice for most use cases.  Since the actions don't require uniqueness, static names will do.

Lastly, this template is not idempotent.  Idempotency is a characteristic of a desired state technology that focuses on ensuring the state (in our case, the template), rather than prescribing process (create this, update that, delete the other thing).  Each of Azure's resource types is responsible for implementing idempotency as much as possible.  For some resources, like the database extensions type which imports .bacpac files, the entire type opts out of idempotency (once data is in the database, no more importing).  For others (such as Azure VMs), only changes in certain properties like administrator name/password break idempotency.  In the OMS search and schedule's case, if either one exists when you deploy the template, the deployment will fail with a code of BadRequest.  To add insult to injury, deleting the saved search from the interface does not delete the schedule.  Neither does deleting the alert from the UI - it simply flips the schedule to "Enabled": false.  There is, in fact, no way to delete the schedule via the UI - you must do so via the REST API, similar to the following:

armclient delete /{Schedule ResourceID}?api-version=2015-03-20
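
If you would rather stay in PowerShell, here is a minimal sketch (assuming the AzureRM.Resources module is loaded and $scheduleResourceId holds the schedule's full resource ID):

[powershell]
# Sketch: delete the orphaned schedule via the AzureRM cmdlets instead of armclient.
# $scheduleResourceId is assumed to contain the schedule's full ARM resource ID.
Remove-AzureRmResource -ResourceId $scheduleResourceId -ApiVersion '2015-03-20' -Force
[/powershell]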

This means that, for now, any versioned solution which desires to deploy ARM template updates to its searches and alerts will need to either:

  1. Delete the existing search/schedule, then redeploy OR
  2. Deploy the updated version side-by-side (named differently), then clean up the prior version's items.

Luckily ours was a POC, so deleting and re-creating was very much an option.  I look forward to the day when search and alerting types become first-class, idempotent citizens.  Until then, I hope this approach proves useful!

The Template

The template below deploys a search, a schedule, a threshold action (the alert), and a webhook action (the action) that includes a custom JSON payload (remember to escape your JSON!).  With a little work, this template could be reworked to segregate the alert-specific components from the search so that multiple alerts could be deployed to the same search or adapted for use with Azure Automation Runbooks.  Enjoy!

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "Action-Payload": {
      "type": "string"
    },
    "Action-Uri": {
      "type": "string"
    },
    "Alert-DisplayName": {
      "type": "string"
    },
    "Search-Category": {
      "type": "string"
    },
    "Search-DisplayName": {
      "type": "string"
    },
    "Search-Query": {
      "type": "string"
    },
    "Workspace-Name": {
      "type": "string"
    },
    "Workspace-Location": {
      "type": "string"
    }
  },
  "variables": {
    "Action-Name": "webhook",
    "Alert-Name": "alert",
    "Schedule-Name": "[uniquestring(variables('Search-Name'))]",
    "Search-Name": "[toLower(concat(variables('Search-Category'), '|', parameters('Search-DisplayName')))]",
    "Search-Id": "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches', parameters('Workspace-Name'), variables('Search-Name'))]",
    "Schedule-Id": "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches/schedules', parameters('Workspace-Name'), variables('Search-Name'), variables('Schedule-Name'))]",
    "Alert-Id": "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions', parameters('Workspace-Name'), variables('Search-Name'), variables('Schedule-Name'), variables('Alert-Name'))]",
    "Action-Id": "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions', parameters('Workspace-Name'), variables('Search-Name'), variables('Schedule-Name'), variables('Action-Name'))]"
  },
  "resources": [
    {
      "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'))]",
      "type": "Microsoft.OperationalInsights/workspaces/savedSearches",
      "apiVersion": "2015-03-20",
      "location": "[parameters('Workspace-Location')]",
      "dependsOn": [],
      "properties": {
        "Category": "[parameters('Search-Category')]",
        "DisplayName": "[parameters('Search-DisplayName')]",
        "Query": "[parameters('Search-Query')]",
        "Version": "1"
      }
    },
    {
      "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'))]",
      "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules",
      "apiVersion": "2015-03-20",
      "location": "[parameters('Workspace-Location')]",
      "dependsOn": [
        "[variables('Search-Id')]"
      ],
      "properties": {
        "Interval": 5,
        "QueryTimeSpan": 5,
        "Active": "true"
      }
    },
    {
      "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'), '/', variables('Alert-Name'))]",
      "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions",
      "apiVersion": "2015-03-20",
      "location": "[parameters('Workspace-Location')]",
      "dependsOn": [
        "[variables('Schedule-Id')]"
      ],
      "properties": {
        "Type": "Alert",
        "Name": "[parameters('Search-DisplayName')]",
        "Threshold": {
          "Operator": "gt",
          "Value": 0
        },
        "Version": 1
      }
    },
    {
      "name": "[concat(parameters('Workspace-Name'), '/', variables('Search-Name'), '/', variables('Schedule-Name'), '/', variables('Action-Name'))]",
      "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions",
      "apiVersion": "2015-03-20",
      "location": "[parameters('Workspace-Location')]",
      "dependsOn": [
        "[variables('Schedule-Id')]"
      ],
      "properties": {
        "Type": "Webhook",
        "Name": "[variables('Action-Name')]",
        "WebhookUri": "[parameters('Action-Uri')]",
        "CustomPayload": "[parameters('Action-Payload')]",
        "Version": 1
      }
    }
  ],
  "outputs": {}
}
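
For completeness, here is a deployment sketch using AzureRM PowerShell - the resource group, file name, and parameter values are all assumptions, and passing the webhook payload through a hashtable avoids having to hand-escape the JSON:

[powershell]
# Sketch: deploy the template above. All names and values are illustrative.
$payload = '{ "text": "OMS alert fired" }'   # hypothetical webhook payload
New-AzureRmResourceGroupDeployment -ResourceGroupName 'my-oms-rg' `
    -TemplateFile .\oms-alert.json `
    -TemplateParameterObject @{
        'Workspace-Name'     = 'my-workspace'
        'Workspace-Location' = 'East US'
        'Search-Category'    = 'Alerts'
        'Search-DisplayName' = 'High Error Count'
        'Search-Query'       = 'Type=Event EventLevelName=error'
        'Alert-DisplayName'  = 'High Error Count Alert'
        'Action-Uri'         = 'https://hooks.example.com/webhook'
        'Action-Payload'     = $payload
    }
[/powershell]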

Saving money in the cloud?


One of the cloud’s big selling points is the promise of lower costs, but more often than not customers who move servers to the cloud end up paying more for the same workload.  Have we all been duped?  Is the promise a lie? Over the past several years the ACE team (the group of experts behind the AzureFieldNotes blog) has helped a number of customers on their Azure journey, many of whom were motivated by the economic benefits of moving to the cloud.  Few take the time to truly understand the business value as it applies to their unique technology estate and develop plans to achieve and measure the benefits.  Most simply assume that running workloads in the cloud will result in lower costs - the more they move, the more they will save.  As a result, management establishes a "Cloud First" initiative and IT scrambles to find workloads that are low risk, low complexity candidates.  Inevitably, these end up being existing virtual machines or physical servers which can be easily migrated to Azure.  And here is where the problems begin.

When customers view Azure as simply another datacenter (which just happens to be in the cloud) they apply their existing datacenter thinking to Azure workloads and they negate any cost benefit.  To realize the savings from cloud computing customers need to shift into consumption-based models and this goes far beyond simply migrating virtual machines to Azure.  When server instances are deployed just like those in the old datacenter and left running 24x7, the same workload will most likely end up costing more in Azure.  In addition, if instances aren't decommissioned when no longer needed it leads to sprawl, environment complexity, and costs that quickly get out of control.

Taking it a step further, customers must also consider which services should continue to be built and maintained in-house, and which should simply be consumed as a service.  These decisions will shape the technical cloud foundations for the enterprise.  Unfortunately, many of these decisions are made based on early applications deployed to Azure.  We call this the "first mover" issue.  Decisions made to support the first app in the cloud may not be the right decisions for subsequent apps or for the enterprise as a whole, leading to redundant and perhaps incompatible architecture, poor performance, higher complexity, and ultimately higher cost.  Take identity as an example:  existing identity solutions deployed in-house are often sacred cows because of the historical investment and specialized skills required to maintain the platform.  Previously, these investments were necessary because the only way to deliver this function was to build your own.  But (with limited exception) identity doesn't differentiate your core business and customers don't pay more or buy more product because of your beloved identity solution.  With the introduction of cloud-based identity, such as Azure Active Directory, companies can now choose to consume identity as a service, eliminate the complexity and specialized skills required to support in-house solutions, and focus talent and resources on higher value services which can truly differentiate the business.

Breaking it down, there are a handful of critical elements that must be addressed for any customer to realize value in the cloud:

  • Business Case:  understand what is valuable to your business, how you measure those things, and how you will achieve the value.  The answers to these questions will be different for every customer, but the need to answer them is universal.  Assuming the cloud will bring value - whether you view value as speed to market, cost reduction, evergreen, simplification, etc. - without understanding how you achieve and measure that goal is a recipe for failure.
  • Cloud Foundations:  infrastructure components that will be shared across all services need to be designed for the Enterprise, and not driven based on the first mover.  It's not unusual for Azure environments to quickly evolve from early Proof of Concept deployments to running production workloads, but the foundations (such as subscription model, network, storage, compute, backup, security, identity, etc.) were never designed for production - you need to spend the time early to get these right or your ability to realize results from Azure will be negatively impacted.
  • Ruthless automation:  standardization and automation underpin virtually every element of the cloud's value proposition and you must embrace them to realize maximum benefit from the cloud.  This goes beyond systems admins having scripts to automate build processes (although that is a start).  It means build and configuration become part of the software development practice, including version control, testing, and design patterns.  In other words, you write code to provision and manage cloud resources and the underlying infrastructure is treated just like software:  infrastructure as code.
  • Operating Model: workloads running in the cloud are different from those in your datacenter and supporting these instances will require changes to the traditional operating model.  As you move higher into the as-a-Service stack (IaaS -> PaaS -> SaaS -> BPaaS etc.) the management layer shifts more and more to the cloud provider.  Introduce DevOps in the equation and the impact to traditional operating models is even greater.  When there is an issue, how is the root cause determined when you don't have a single party responsible for the full stack?  Who is responsible for resolution of service and how will hand-offs work between the cloud provider and your in-house support teams?  What tools are involved, what skills are required, and how is information tracked and communicated?  In the end, much of the savings from cloud can come from transformation within the operating model.
  • Governance and Controls:  If you thought keeping a handle on systems running in your datacenter was a challenge, the cloud can make it exponentially worse.  Self-service and near instantaneous access to resources is the perfect storm for introducing server sprawl without proper governance and controls.  In addition, since cloud resources aren't sitting within the datacenter where IT has full control of the entire stack, how can you be sure data is secure, systems are protected, and the company is not exposed to regulatory or legal risk?

In future posts I'll cover each one of these in more detail to help frame how you can maximize the value of Azure (and how Azure Stack can play an important role) in your cloud journey.

Azure Stack TP2 Hacks: Custom Domain Names and Exposing to the Internet


In some previous posts, we covered some "hacks" to Azure Stack TP1, primarily enabling a customized domain name and exposing the stack to the internet.  If you have not noticed yet, the installation has changed greatly in TP2. The process is now driven by ECEngine and should be far more indicative of how the final product gets deployed. While the installer has changed greatly, fortunately the process to expose the stack publicly has only changed in a few minor ways. Without getting too involved in how it works, the installation operates from a series of PowerShell modules and Pester tests tied to a configuration composed from a number of XML configuration files. The configuration files support the use of variables and parameters to drive most of the PowerShell action. As with TP1, the stack is wired so that the DNS domain name for Active Directory must match the public DNS domain name (think certificates and host headers). This is a much less involved change in TP2: it mostly requires replacing a couple of straggling hard-coded entries with variables in some OneNodeConfig.xml files and changing the installer bootstrapper to use them.  Once again, I will admonish you that this is wholly unsupported.

There are six files that need minor changes; we will start with the XML config files.

Config Files

C:\CloudDeployment\Configuration\Roles\Fabric\IdentityProvider\OneNodeRole.xml Line 11 From

[xml] <IdentityApplication Name="Deployment" ResourceId="https://deploy.azurestack.local/[Deployment_Guid]" DisplayName="Deployment Application" CertPath="{Infrastructure}\ASResourceProvider\Cert\Deployment.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\Deployment.IdentityApplication.Configuration.json" > </IdentityApplication> [/xml]

To

[xml] <IdentityApplication Name="Deployment" ResourceId="https://deploy.[DOMAINNAMEFQDN]/[Deployment_Guid]" DisplayName="Deployment Application" CertPath="{Infrastructure}\ASResourceProvider\Cert\Deployment.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\Deployment.IdentityApplication.Configuration.json" > </IdentityApplication> [/xml]

C:\CloudDeployment\Configuration\Roles\Fabric\KeyVault\OneNodeRole.xml Line 12 From

[xml] <IdentityApplication Name="KeyVault" ResourceId="https://vault.azurestack.local/[Deployment_Guid]" DisplayName="AzureStack KeyVault" CertPath="{Infrastructure}\ASResourceProvider\Cert\KeyVault.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\KeyVault.IdentityApplication.Configuration.json" > <AADPermissions> <ApplicationPermission Name="ReadDirectoryData" /> </AADPermissions> <OAuth2PermissionGrants> <FirstPartyApplication FriendlyName="PowerShell" /> <FirstPartyApplication FriendlyName="VisualStudio" /> <FirstPartyApplication FriendlyName="AzureCLI" /> </OAuth2PermissionGrants> </IdentityApplication> [/xml]

To

[xml] <IdentityApplication Name="KeyVault" ResourceId="https://vault.[DOMAINNAMEFQDN]/[Deployment_Guid]" DisplayName="AzureStack KeyVault" CertPath="{Infrastructure}\ASResourceProvider\Cert\KeyVault.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\KeyVault.IdentityApplication.Configuration.json" > <AADPermissions> <ApplicationPermission Name="ReadDirectoryData" /> </AADPermissions> <OAuth2PermissionGrants> <FirstPartyApplication FriendlyName="PowerShell" /> <FirstPartyApplication FriendlyName="VisualStudio" /> <FirstPartyApplication FriendlyName="AzureCLI" /> </OAuth2PermissionGrants> </IdentityApplication> [/xml]

Line 26 From

[xml] <AzureKeyVaultSuffix>vault.azurestack.local</AzureKeyVaultSuffix> [/xml]

To

[xml] <AzureKeyVaultSuffix>vault.[DOMAINNAMEFQDN]</AzureKeyVaultSuffix> [/xml]

C:\CloudDeployment\Configuration\Roles\Fabric\WAS\OneNodeRole.xml Line(s) 96-97 From

[xml] <IdentityApplication Name="ResourceManager" ResourceId="https://api.azurestack.local/[Deployment_Guid]" HomePage="https://api.azurestack.local/" DisplayName="AzureStack Resource Manager" CertPath="{Infrastructure}\ASResourceProvider\Cert\ResourceManager.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\ResourceManager.IdentityApplication.Configuration.json" Tags="MicrosoftAzureStack" > [/xml]

To

[xml] <IdentityApplication Name="ResourceManager" ResourceId="https://api.[DOMAINNAMEFQDN]/[Deployment_Guid]" HomePage="https://api.[DOMAINNAMEFQDN]/" DisplayName="AzureStack Resource Manager" CertPath="{Infrastructure}\ASResourceProvider\Cert\ResourceManager.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\ResourceManager.IdentityApplication.Configuration.json" Tags="MicrosoftAzureStack" > [/xml]

Line(s) 118-120 From

[xml] <IdentityApplication Name="Portal" ResourceId="https://portal.azurestack.local/[Deployment_Guid]" HomePage="https://portal.azurestack.local/" ReplyAddress="https://portal.azurestack.local/" DisplayName="AzureStack Portal" CertPath="{Infrastructure}\ASResourceProvider\Cert\Portal.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\Portal.IdentityApplication.Configuration.json" > [/xml]

To

[xml] <IdentityApplication Name="Portal" ResourceId="https://portal.[DOMAINNAMEFQDN]/[Deployment_Guid]" HomePage="https://portal.[DOMAINNAMEFQDN]/" ReplyAddress="https://portal.[DOMAINNAMEFQDN]/" DisplayName="AzureStack Portal" CertPath="{Infrastructure}\ASResourceProvider\Cert\Portal.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\Portal.IdentityApplication.Configuration.json" > [/xml]

Line 129 From

[xml] <ResourceAccessPermissions> <UserImpersonationPermission AppURI="https://api.azurestack.local/[Deployment_Guid]" /> </ResourceAccessPermissions> [/xml]

To

[xml] <ResourceAccessPermissions> <UserImpersonationPermission AppURI="https://api.[DOMAINNAMEFQDN]/[Deployment_Guid]" /> </ResourceAccessPermissions> [/xml]

Line 133 From

[xml] <IdentityApplication Name="Policy" ResourceId="https://policy.azurestack.local/[Deployment_Guid]" DisplayName="AzureStack Policy Service" CertPath="{Infrastructure}\ASResourceProvider\Cert\Policy.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\Policy.IdentityApplication.Configuration.json" > [/xml]

To

[xml] <IdentityApplication Name="Policy" ResourceId="https://policy.[DOMAINNAMEFQDN]/[Deployment_Guid]" DisplayName="AzureStack Policy Service" CertPath="{Infrastructure}\ASResourceProvider\Cert\Policy.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\Policy.IdentityApplication.Configuration.json" > [/xml]

Line 142 From

[xml] <IdentityApplication Name="Monitoring" ResourceId="https://monitoring.azurestack.local/[Deployment_Guid]" DisplayName="AzureStack Monitoring Service" CertPath="{Infrastructure}\ASResourceProvider\Cert\Monitoring.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\Monitoring.IdentityApplication.Configuration.json" > </IdentityApplication> [/xml]

To

[xml] <IdentityApplication Name="Monitoring" ResourceId="https://monitoring.[DOMAINNAMEFQDN]/[Deployment_Guid]" DisplayName="AzureStack Monitoring Service" CertPath="{Infrastructure}\ASResourceProvider\Cert\Monitoring.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\Monitoring.IdentityApplication.Configuration.json" > </IdentityApplication> [/xml]

C:\CloudDeployment\Configuration\Roles\Fabric\FabricRingServices\XRP\OneNodeRole.xml Line 114 From

[xml] <IdentityApplication Name="Monitoring" ResourceId="https://monitoring.azurestack.local/[Deployment_Guid]" DisplayName="AzureStack Monitoring Service" CertPath="{Infrastructure}\ASResourceProvider\Cert\Monitoring.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\Monitoring.IdentityApplication.Configuration.json" > </IdentityApplication> [/xml]

To

[xml] <IdentityApplication Name="Monitoring" ResourceId="https://monitoring.[DOMAINNAMEFQDN]/[Deployment_Guid]" DisplayName="AzureStack Monitoring Service" CertPath="{Infrastructure}\ASResourceProvider\Cert\Monitoring.IdentityApplication.ClientCertificate.pfx" ConfigPath="{Infrastructure}\ASResourceProvider\Config\Monitoring.IdentityApplication.Configuration.json" > </IdentityApplication> [/xml]

Scripts

Now we will edit the installation bootstrapping scripts.

We will start by adding two new parameters ($ADDomainName and $DomainNetbiosName) to C:\CloudDeployment\Configuration\New-OneNodeManifest.ps1 and have the manifest generation use them.

[powershell] param ( [Parameter(Mandatory=$true)] [Xml] $InputXml,

[Parameter(Mandatory=$true)] [String] $OutputFile,

[Parameter(Mandatory=$true)] [System.Guid] $DeploymentGuid,

[Parameter(Mandatory=$false)] [String] $Model,

[Parameter(Mandatory=$true)] [String] $HostIPv4Address,

[Parameter(Mandatory=$true)] [String] $HostIPv4DefaultGateway,

[Parameter(Mandatory=$true)] [String] $HostSubnet,

[Parameter(Mandatory=$true)] [bool] $HostUseDhcp,

[Parameter(Mandatory=$true)] [string] $PhysicalMachineMacAddress,

[Parameter(Mandatory=$true)] [String] $HostName,

[Parameter(Mandatory=$true)] [String] $NatIPv4Address,

[Parameter(Mandatory=$true)] [String] $NATIPv4Subnet,

[Parameter(Mandatory=$true)] [String] $NatIPv4DefaultGateway,

[Parameter(Mandatory=$false)] [Int] $PublicVlanId,

[Parameter(Mandatory=$true)] [String] $TimeServer,

[Parameter(Mandatory=$true)] [String] $TimeZone,

[Parameter(Mandatory=$true)] [String[]] $EnvironmentDNS,

[Parameter(Mandatory=$false)] [String] $ADDomainName='azurestack.local',

[Parameter(Mandatory=$false)] [String] $DomainNetbiosName='azurestack',

[Parameter(Mandatory=$true)] [string] $AADDirectoryTenantName,

[Parameter(Mandatory=$true)] [string] $AADDirectoryTenantID,

[Parameter(Mandatory=$true)] [string] $AADAdminSubscriptionOwner,

[Parameter(Mandatory=$true)] [string] $AADClaimsProvider )
$Xml.InnerXml = $Xml.InnerXml.Replace('[PREFIX]', 'MAS')
$Xml.InnerXml = $Xml.InnerXml.Replace('[DOMAINNAMEFQDN]', $ADDomainName)
$Xml.InnerXml = $Xml.InnerXml.Replace('[DOMAINNAME]', $DomainNetbiosName) [/powershell]

The final edit(s) we need to make are to C:\CloudDeployment\Configuration\InstallAzureStackPOC.ps1. We will start by adding the same parameters to this script.

[powershell] [CmdletBinding(DefaultParameterSetName="DefaultSet")] param ( [Parameter(Mandatory=$false, ParameterSetName="RerunSet")] [Parameter(Mandatory=$true, ParameterSetName="AADSetStaticNAT")] [Parameter(Mandatory=$true, ParameterSetName="DefaultSet")] [SecureString] $AdminPassword,

[Parameter(Mandatory=$false)] [PSCredential] $AADAdminCredential,

[Parameter(Mandatory=$false)] [String] $AdDomainName='azurestack.local',

[Parameter(Mandatory=$false)] [String] $DomainNetbiosName='AzureStack',

[Parameter(Mandatory=$false)] [String] $AADDirectoryTenantName,

[Parameter(Mandatory=$false)] [ValidateSet('Public Azure','Azure - China', 'Azure - US Government')] [String] $AzureEnvironment = 'Public Azure',

[Parameter(Mandatory=$false)] [String[]] $EnvironmentDNS,

[Parameter(Mandatory=$true, ParameterSetName="AADSetStaticNAT")] [String] $NATIPv4Subnet,

[Parameter(Mandatory=$true, ParameterSetName="AADSetStaticNAT")] [String] $NATIPv4Address,

[Parameter(Mandatory=$true, ParameterSetName="AADSetStaticNAT")] [String] $NATIPv4DefaultGateway,

[Parameter(Mandatory=$false)] [Int] $PublicVlanId,

[Parameter(Mandatory=$false)] [string] $TimeServer = 'time.windows.com',

[Parameter(Mandatory=$false, ParameterSetName="RerunSet")] [Switch] $Rerun ) [/powershell]

The next edit will occur at lines 114-115. From

[powershell] $FabricAdminUserName = 'AzureStack\FabricAdmin'
$SqlAdminUserName = 'AzureStack\SqlSvc' [/powershell]

To

[powershell] $FabricAdminUserName = "$DomainNetbiosName\FabricAdmin" $SqlAdminUserName = "$DomainNetbiosName\SqlSvc" [/powershell]

Finally, we will modify the last statement of the script, at line 312, to pass the new parameters.

[powershell] & $PSScriptRoot\New-OneNodeManifest.ps1 -InputXml $xml `
    -OutputFile $outputConfigPath `
    -Model $model `
    -DeploymentGuid $deploymentGuid `
    -HostIPv4Address $hostIPv4Address `
    -HostIPv4DefaultGateway $hostIPv4Gateway `
    -HostSubnet $hostSubnet `
    -HostUseDhcp $hostUseDhcp `
    -PhysicalMachineMacAddress $physicalMachineMacAddress `
    -HostName $hostName `
    -NATIPv4Address $NATIPv4Address `
    -NATIPv4Subnet $NATIPv4Subnet `
    -NATIPv4DefaultGateway $NATIPv4DefaultGateway `
    -PublicVlanId $PublicVlanId `
    -TimeServer $TimeServer `
    -TimeZone $timezone `
    -EnvironmentDNS $EnvironmentDNS `
    -AADDirectoryTenantName $AADDirectoryTenantName `
    -AADDirectoryTenantID $AADDirectoryTenantID `
    -AADAdminSubscriptionOwner $AADAdminSubscriptionOwner `
    -AADClaimsProvider $AADClaimsProvider `
    -ADDomainName $AdDomainName `
    -DomainNetbiosName $DomainNetbiosName [/powershell]

NAT Configuration

So, you have now customized the domain for your one-node Azure Stack install and want to get it on the internet. This process is almost identical to TP1, save for two changes. First, TP1 had separate BGPVM and NATVM machines, while there is now a single MAS-BGPNAT01 machine; the BGPNAT role only exists in the one-node (HyperConverged) installation. The other change is the type of Remote Access installation: TP1 used the "legacy" RRAS for NAT, where all configuration was UI or netsh based, whereas TP2 has transitioned to "modern" Remote Access that is only really manageable through PowerShell. To enable the appropriate NAT mappings we will need three PowerShell cmdlets: Get-NetNat, Add-NetNatExternalAddress, and Add-NetNatStaticMapping. I use a script to create all the mappings, which takes a simple object - in our case deserialized from JSON. This file is a simple collection of the NAT entries and mappings to be created.

[javascript] { "Portal": { "External": "172.20.40.39", "Ports": [ 80,443,30042,13011,30011,30010,30016,30015,13001,13010,13021,30052,30054,13020,30040,13003,30022,12998,12646,12649,12647,12648,12650,53056,57532,58462,58571,58604,58606,58607,58608,58610,58613,58616,58618,58619,58620,58626,58627,58628,58629,58630,58631,58632,58633,58634,58635,58636,58637,58638,58639,58640,58641,58642,58643,58644,58646,58647,58648,58649,58650,58651,58652,58653,58654,58655,58656,58657,58658,58659,58660,58661,58662,58663,58664,58665,58666,58667,58668,58669,58670,58671,58672,58673,58674,58675,58676,58677,58678,58679,58680,58681,58682,58683,58684,58685,58686,58687,58688,58689,58690,58691,58692,58693,58694,58695,58696,58697,58698,58699,58701 ], "Internal": "192.168.102.5" }, "API": { "External": "172.20.40.38", "Ports": [ 80,443,30042,13011,30011,30010,30016,30015,13001,13010,13021,30052,30054,13020,30040,13003,30022,12998,12646,12649,12647,12648,12650,53056,57532,58462,58571,58604,58606,58607,58608,58610,58613,58616,58618,58619,58620,58626,58627,58628,58629,58630,58631,58632,58633,58634,58635,58636,58637,58638,58639,58640,58641,58642,58643,58644,58646,58647,58648,58649,58650,58651,58652,58653,58654,58655,58656,58657,58658,58659,58660,58661,58662,58663,58664,58665,58666,58667,58668,58669,58670,58671,58672,58673,58674,58675,58676,58677,58678,58679,58680,58681,58682,58683,58684,58685,58686,58687,58688,58689,58690,58691,58692,58693,58694,58695,58696,58697,58698,58699,58701 ], "Internal": "192.168.102.4" }, "DataVault": { "External": "172.20.40.43", "Ports": [80,443], "Internal": "192.168.102.3" }, "CoreDataVault": { "External": "172.20.40.44", "Ports": [80,443], "Internal": "192.168.102.3" }, "Graph": { "External": "172.20.40.40", "Ports": [80,443], "Internal": "192.168.102.8" }, "Extensions": { "External": "172.20.40.41", "Ports": [ 80,443,30042,13011,30011,30010,30016,30015,13001,13010,13021,30052,30054,13020,30040,13003,30022,12998,12646,12649,12647,12648,12650,53056,57532,58462,58571,58604,58606,58607,58608,58610,58613,58616,58618,58619,58620,58626,58627,58628,58629,58630,58631,58632,58633,58634,58635,58636,58637,58638,58639,58640,58641,58642,58643,58644,58646,58647,58648,58649,58650,58651,58652,58653,58654,58655,58656,58657,58658,58659,58660,58661,58662,58663,58664,58665,58666,58667,58668,58669,58670,58671,58672,58673,58674,58675,58676,58677,58678,58679,58680,58681,58682,58683,58684,58685,58686,58687,58688,58689,58690,58691,58692,58693,58694,58695,58696,58697,58698,58699,58701 ], "Internal": "192.168.102.7" }, "Storage": { "External": "172.20.40.42", "Ports": [80,443], "Internal": "192.168.102.6" } } [/javascript]

In the one-node TP2 deployment, 192.168.102.0 is the subnet for "Public" IP addresses, and you will notice all the VIPs for the stack reside on that subnet. We have 1-to-1 NAT for all the "External" addresses we associate with a given Azure Stack instance.

[powershell] [CmdletBinding()] param ( [Parameter(Mandatory=$true)] [psobject] $NatConfig )

# There's only one NAT; you could also do Get-NetNat -Name BGPNAT
$NatSetup=Get-NetNat

$NatConfigNodeNames=$NatConfig|Get-Member -MemberType NoteProperty|Select-Object -ExpandProperty Name

foreach ($NatConfigNodeName in $NatConfigNodeNames) {
    Write-Verbose "Configuring NAT for Item $NatConfigNodeName"
    $ExIp=$NatConfig."$NatConfigNodeName".External
    $InternalIp=$NatConfig."$NatConfigNodeName".Internal
    $NatPorts=$NatConfig."$NatConfigNodeName".Ports
    Write-Verbose "Adding External Address $ExIp"
    Add-NetNatExternalAddress -NatName $NatSetup.Name -IPAddress $ExIp -PortStart 80 -PortEnd 63356
    Write-Verbose "Adding Static Mappings"

    foreach ($natport in $NatPorts) {
        #TCP
        Write-Verbose "Adding NAT Mapping $($ExIp):$($natport)->$($InternalIp):$($natport)"
        Add-NetNatStaticMapping -NatName $NatSetup.Name -Protocol TCP `
            -ExternalIPAddress $ExIp -InternalIPAddress $InternalIp `
            -ExternalPort $natport -InternalPort $NatPort

    }
}
[/powershell]
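
To tie it together, here is a usage sketch (the file and script names are assumptions) that deserializes the JSON above and runs the mapping script on MAS-BGPNAT01:

[powershell]
# Sketch: load the NAT configuration JSON and create the mappings (file/script names are assumptions).
$NatConfig = Get-Content -Path .\NatConfig.json -Raw | ConvertFrom-Json
.\Add-StackNatMappings.ps1 -NatConfig $NatConfig -Verbose
[/powershell]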

DNS Records

The final step will be adding the requisite DNS entries, which have changed slightly as well.  In the interest of simplicity, assume the final octet of the IP addresses on the 172.20.40.0 subnet has a 1-to-1 NAT mapping to 38.77.x.0 (e.g. 172.20.40.40 -> 38.77.x.40).

A Record IP Address
api 38.77.x.38
portal 38.77.x.39
*.blob 38.77.x.42
*.table 38.77.x.42
*.queue 38.77.x.42
*.vault 38.77.x.43
data.vaultcore 38.77.x.44
control.vaultcore 38.77.x.44
xrp.tenantextensions 38.77.x.44
compute.adminextensions 38.77.x.41
network.adminextensions 38.77.x.41
health.adminextensions 38.77.x.41
storage.adminextensions 38.77.x.41
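
If your public zone happens to be hosted on a Windows DNS server, the records can be scripted as well - a sketch, with the zone name and addresses as placeholders (most public DNS providers will need their own tooling):

[powershell]
# Sketch: create the A records on a Windows DNS server hosting the public zone.
# Zone name and IP addresses are placeholders - substitute your own values.
$zone = 'yourazurestack.com'
$records = [ordered]@{
    'api'                     = '203.0.113.38'
    'portal'                  = '203.0.113.39'
    '*.blob'                  = '203.0.113.42'
    '*.table'                 = '203.0.113.42'
    '*.queue'                 = '203.0.113.42'
    '*.vault'                 = '203.0.113.43'
    'data.vaultcore'          = '203.0.113.44'
    'control.vaultcore'       = '203.0.113.44'
    'xrp.tenantextensions'    = '203.0.113.44'
    'compute.adminextensions' = '203.0.113.41'
    'network.adminextensions' = '203.0.113.41'
    'health.adminextensions'  = '203.0.113.41'
    'storage.adminextensions' = '203.0.113.41'
}
foreach ($name in $records.Keys) {
    Add-DnsServerResourceRecordA -ZoneName $zone -Name $name -IPv4Address $records[$name]
}
[/powershell]
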
Connecting to the Stack

You will need to export the root certificate from your installation's CA and import it on any clients that will access your deployment.  Exporting the root certificate is very simple, as the host system is joined to the domain which hosts the Azure Stack CA. To export the root certificate to your desktop, run this simple one-liner in the PowerShell console of your host system (the same command will work from the Console VM).

[powershell] Get-ChildItem -Path Cert:\LocalMachine\Root| `
    Where-Object{$_.Subject -like "CN=AzureStackCertificationAuthority*"}| `
    Export-Certificate -FilePath "$env:USERPROFILE\Desktop\$($env:USERDOMAIN)RootCA.cer" -Type CERT [/powershell]

The process for importing this certificate on your client will vary depending on the OS version; as such I will avoid giving a scripted method.

Right click the previously exported certificate.


Choose Current User for most use-cases.


Select Browse for the appropriate store.


Select Trusted Root Certification Authorities


Confirm the Import Prompt

To connect with PowerShell or REST API you will need the deployment GUID. This can be obtained from the host with the following snippet.

[powershell] [xml]$deployinfo = Get-content "C:\CloudDeployment\Config.xml"
$deploymentguid = $deployinfo.CustomerConfiguration.Role.Roles.Role.Roles.Role | % {$_.PublicInfo.DeploymentGuid} [/powershell]

This value can then be used to connect to your stack.

[powershell] #Deployment GUID
$EnvironmentID='4bc6f444-ff15-4fd7-9bfa-5495891fe876'
#The DNS Domain used for the Install
$StackDomain="yourazurestack.com"
#The AAD Domain Name (e.g. bobsdomain.onmicrosoft.com)
$AADDomainName='youraadtenant.com'
#The AAD Tenant ID
$AADTenantID = "youraadtenant.com"
#The Username to be used
$AADUserName="username@$AADDomainName"
#The Password to be used
$AADPassword='P@ssword1'|ConvertTo-SecureString -Force -AsPlainText
#The Credential to be used. Alternatively could use Get-Credential
$AADCredential=New-Object PSCredential($AADUserName,$AADPassword)
#The AAD Application Resource URI
$ApiAADResourceID="https://api.$StackDomain/$EnvironmentID"
#The ARM Endpoint
$StackARMUri="Https://api.$StackDomain/"
#The Gallery Endpoint
$StackGalleryUri="Https://portal.$($StackDomain):30016/"
#The OAuth Redirect Uri
$AadAuthUri="https://login.windows.net/$AADTenantID/"
#The MS Graph API Endpoint
$GraphApiEndpoint="https://graph.windows.net/"

#Add the Azure Stack Environment
Add-AzureRmEnvironment -Name "Azure Stack" `
    -ActiveDirectoryEndpoint $AadAuthUri `
    -ActiveDirectoryServiceEndpointResourceId $ApiAADResourceID `
    -ResourceManagerEndpoint $StackARMUri `
    -GalleryEndpoint $StackGalleryUri `
    -GraphEndpoint $GraphApiEndpoint

#Add the environment to the context using the credential
$env = Get-AzureRmEnvironment 'Azure Stack'
Add-AzureRmAccount -Environment $env -Credential $AADCredential -Verbose [/powershell]

Note: You will need a TP2-specific version of the Azure PowerShell module for many operations.  Enjoy, and stay tuned for more.
