Introduction

IaC (infrastructure as code) has revolutionized the way we implement architecture, and Terraform is one of the most widely used tools for the job. Terraform not only makes implementation easy, but also lets us define modules that can be reused again and again for future deployments. In this series I will share the Terraform code that I have written to deploy the entire architecture.

Architecture that we are going to deploy

Optimized storage: time-based tiering with Data Lake.

Potential use cases:

The architecture may be appropriate for any application that uses massive amounts of data that must always be available. Examples include apps that:

  • Track customer spending habits and shopping behavior.
  • Forecast weather.
  • Implement smart traffic systems or use smart technology to monitor traffic.
  • Analyze manufacturing Internet of Things (IoT) data.
  • Display smart meter data or use smart technology to monitor meter data.

Assumptions in the architecture:

  1. The client authenticates with Azure Active Directory (Azure AD) and is granted access to web applications hosted on Azure App Service.
  2. Azure Front Door, a layer 7 load balancer with firewall capabilities, switches user traffic to a different Azure region in case of a regional outage.
  3. Azure App Service hosts websites and RESTful web APIs. Browser clients run AJAX applications that use the APIs.
  4. Web APIs delegate background tasks to function apps. The tasks are queued in Azure Queue Storage queues.
  5. The function apps hosted by Azure Functions perform the background tasks, triggered by the queued messages.
  6. Azure Cache for Redis caches database data for the function apps. This offloads database activity and speeds up the function apps and web apps.
  7. Azure Cosmos DB holds 3 to 4 months of the most recent data used by the web applications.
  8. Data Lake Storage holds historical data used by the web applications. Periodically, Azure Data Factory moves data from Azure Cosmos DB to Azure Data Lake to reduce storage costs.
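Point 7 above can be expressed directly in Terraform. As a hedged illustration (the resource, account, and database names below are placeholders, not taken from the repository), a Cosmos DB SQL container's default TTL can be set to roughly four months, so documents expire after Data Factory has copied them to Data Lake:

```hcl
# Illustrative sketch only: a Cosmos DB SQL container whose default TTL
# expires documents after ~4 months, matching the retention described above.
# All names here are hypothetical placeholders.
resource "azurerm_cosmosdb_sql_container" "recent_data" {
  name                = "recent-data"
  resource_group_name = "rg-datalake-demo"
  account_name        = "cosmos-datalake-demo"
  database_name       = "appdb"
  partition_key_paths = ["/customerId"]

  # ~120 days in seconds; Data Factory moves items to Data Lake before expiry
  default_ttl = 60 * 60 * 24 * 120
}
```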

Terraform Explanation

  1. GitHub link: GitHub
  2. YouTube video with the explanation

This code is parameterized, and we use a separate module for every Azure resource. This keeps the code manageable and enables reuse. The parent directory Data Lake Deployment contains two subfolders:

  1. terraform-modules
    • This directory contains all the modules.
    • Any variable condition needs to be checked in the variable.tf file under the respective module.
  2. terraform-resources
    • This folder has these files:
      • main.tf -> Contains all the resources that we are going to deploy.
      • variables.tf -> Variables are defined in this file.
      • terraform.tfvars -> A tfvars file is loaded automatically, without any additional command-line option. This is the file where you update your resource values\names.
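The two folders connect through module blocks in terraform-resources/main.tf that point at terraform-modules via relative paths. A minimal sketch, assuming hypothetical module and variable names (the repository's actual names may differ):

```hcl
# Hypothetical example of how terraform-resources/main.tf consumes a module
# from terraform-modules. Module name and input variables are assumptions.
module "webapp" {
  source = "../terraform-modules/webapp"

  resource_group_name = var.resource_group_name
  location            = var.location
  app_service_sku     = var.app_service_sku
}
```

Values for these variables come from terraform.tfvars, so switching environments only requires editing that one file.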

Usage:

Things to keep in mind:

  1. We are using Azure Blob Storage as the backend, so you need to ensure the storage account and container mentioned in the backend configuration exist.
  2. Since we are using data blocks to fetch Key Vault secrets, where we store all our sensitive information, the Key Vault and the secrets should already exist.
  3. The Data Lake linked service is configured in the “Data Lake Deployment\terraform-modules\data-lake\main.tf” file. I haven’t written code to implement datasets, because those vary depending on the requirement, but they can easily be added.
  4. The App Service site config is set on lines 22 to 26 of the “Data Lake Deployment\terraform-modules\webapp\main.tf” file. The application framework can be changed using these lines (e.g. Java, Python, .NET).
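Points 1 and 2 above can be sketched in HCL. This is an illustration under assumed names (storage account, container, vault, and secret names below are placeholders; the repository's actual values live in its own backend block and data sources):

```hcl
# Sketch of an azurerm backend: the storage account and container named here
# must already exist before terraform init, as noted in point 1.
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"
    storage_account_name = "sttfstatedemo"
    container_name       = "tfstate"
    key                  = "datalake.terraform.tfstate"
  }
}

# Sketch of point 2: data blocks read secrets from a pre-existing Key Vault,
# so nothing sensitive needs to appear in the tfvars file.
data "azurerm_key_vault" "existing" {
  name                = "kv-datalake-demo"
  resource_group_name = "rg-datalake-demo"
}

data "azurerm_key_vault_secret" "sql_password" {
  name         = "sql-admin-password"
  key_vault_id = data.azurerm_key_vault.existing.id
}
```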

To run this example, follow the steps below:

  1. Open the terraform.tfvars file under the terraform-resources folder and update the parameters (such as region, resource group, or SKU) based on your requirements.
  2. Open a command prompt or PowerShell and change the working directory to the terraform-resources folder. If the terminal is opened in the Data Lake Deployment directory, use:
  cd terraform-resources

  3. Run:
  terraform init
  terraform plan
  terraform apply


Explanation :

  1. terraform init -> Initializes the directory, downloads the required providers, and configures the modules.
  2. terraform plan -> Helps you verify that the code will deploy the resources as expected, so we don’t face any unwanted surprises. It isn’t mandatory, but it is a recommended step.
  3. terraform apply -> Applies the resources specified in the code. It will ask you to approve the plan before applying. Manual approval can be skipped with the -auto-approve parameter.

Best Practices & Recommendations

  1. Use Terraform workspaces for easier management of deployments. They can also help us manage Dev, UAT, and Production deployments without creating multiple state files\directories.
  2. If you create new resources\variables, ensure the naming convention is easily relatable, since we have a lot of variables in this code.
  3. Use variable conditions (validation blocks) to avoid unwanted surprises.
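Recommendations 1 and 3 can be sketched as follows. This is an illustrative fragment, not code from the repository; the local names and the allowed-region list are assumptions:

```hcl
# Recommendation 1: derive environment-specific names from the active
# workspace instead of maintaining separate state directories.
locals {
  environment = terraform.workspace # e.g. dev, uat, prod
  rg_name     = "rg-datalake-${local.environment}"
}

# Recommendation 3: a validation block rejects invalid input at plan time,
# before anything is deployed. The region list here is a placeholder.
variable "location" {
  type        = string
  description = "Azure region for all resources"

  validation {
    condition     = contains(["eastus", "westeurope"], var.location)
    error_message = "location must be one of: eastus, westeurope."
  }
}
```

Workspaces are created once with `terraform workspace new dev` (and similarly for uat and prod); after that, `terraform workspace select` switches the state that plan and apply operate on.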