Managing a Kubernetes Cluster Using Terraform

Table of Contents

  1. Introduction
  2. Prerequisites
  3. Creating our First GKE Cluster Using Terraform
    1. Terraform Providers
    2. Terraform Resources
    3. Terraform Plan
    4. Terraform Apply
  4. Updating Resources with Terraform
  5. Destroying a Cluster Using Terraform
  6. Conclusion

Introduction

Kubernetes (K8s) is one of the most popular open-source container orchestration and scheduling tools. Google originally developed it, but it is not the only contributor: many independent developers and companies such as Red Hat, Huawei, Microsoft, and IBM contribute to its development.

Kubernetes has a client/server architecture. In a Kubernetes cluster, you will always find a master and one or more workers. The master, or Kubernetes Control Plane, acts as the controlling node. It is made of multiple components such as kube-scheduler, kube-apiserver, etcd, and kube-controller-manager. By default, a Kubernetes cluster has one master, but it is possible to set up a multi-master cluster. In both cases, the master controls the worker nodes.

A node, previously known as a minion, is a worker machine: usually a VM, but it can also be a bare-metal machine. Each node runs the services the master needs to manage pods, e.g., the kubelet, the container runtime, and kube-proxy.

Looking at the Kubernetes architecture, we realize that it is a complex system. Some of this complexity is required to build such a resilient and abstract system. The complexity is not only functional; it also lies in deploying and maintaining a Kubernetes cluster.

To create your own Kubernetes cluster, you have to provision your own resources and certificates, generate Kubernetes configurations for authentication, manage data encryption, bootstrap the etcd cluster, the control plane, and the worker nodes, manage pod networking routes, set up the DNS add-on, and smoke-test the result. Some open-source tools can help, but deploying and managing your own Kubernetes cluster is still not an easy task. This is why many companies choose managed Kubernetes services like GKE.

Cloud-managed clusters make using Kubernetes easier since you don't need to maintain your cluster and its dependencies. With IaC (infrastructure as code), bootstrapping a Kubernetes cluster is even easier. It also has many advantages since it allows you to create and maintain different Kubernetes environments; you can also add your infrastructure to version control and share it across teams and individuals.

One of the pillars of DevOps is self-service infrastructure. Tools like Terraform allow you to create and validate infrastructure templates that you can use and reuse for on-demand provisioning. In this blog post, we are going to use Terraform to create an infrastructure template for GKE clusters.

Prerequisites

Before starting, you should have a valid Google Cloud account. Next, select or create a project and activate the Kubernetes Engine API for it. Make sure that a billing account is linked to your project.

Activating the API can take a few minutes. Once it is done, install the Google Cloud SDK.

After installing the SDK, set the default project from your terminal (or from Cloud Shell):

gcloud config set project <project-id>

 ‍

Set a compute zone:

 

gcloud config set compute/zone <compute-zone>

‍ 

Note that you can get a list of available zones using:

 

gcloud compute zones list

Now you can test creating a cluster using:

 

gcloud container clusters create <cluster_name>

 

Terraform interacts with the Google Cloud Platform API. A good practice is to create a Service Account that is used only by Terraform. This gives us more control and makes security easier to manage.

In the Cloud Console, click on "IAM & Admin" -> "Service Accounts", and click on "Create a Service Account".

Give the Service Account a name and grant it the "Project Editor" role. When you are asked to generate a JSON key for this account, download it and save it to:

<project>/auth/serviceaccount.json

‍ 

<project> is your project folder, where you will create the Terraform template. You can also add a .gitignore file to keep the credentials and Terraform's local working files out of version control:

auth/* 
.terraform/*

 

We also need to install Terraform. It is distributed as a single binary, so installation is simple: download the archive for your OS from the official download page, unzip it, and move the executable to a directory on your PATH. For example, on 64-bit Linux:

wget https://releases.hashicorp.com/terraform/0.12.18/terraform_0.12.18_linux_amd64.zip
unzip terraform_0.12.18_linux_amd64.zip
sudo mv terraform /usr/bin/terraform
sudo chmod +x /usr/bin/terraform

 ‍

Creating our First GKE Cluster Using Terraform

Terraform Providers

Terraform can interact with many cloud providers like AWS, Azure, and GCP. For each cloud, Terraform needs a kind of driver to interface with the cloud API for authentication and management. This "driver" is called a provider.

Let's configure the provider for GCP. Create a "provider.tf" file and paste the following code:

provider "google" {
  credentials = file("./auth/serviceaccount.json")
  project     = "<your_project_name>"
  region      = "<your_region>"
}

 

Make sure that you saved the previously generated JSON key to "auth/serviceaccount.json", then launch the initialization using:

terraform init

‍ 

This initializes the project and downloads the Google provider plugin:

Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "google" (hashicorp/google) 3.2.0...
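
The output above shows that "terraform init" selected version 3.2.0 of the Google provider. Since an unpinned provider can change from one run to the next, one option is to record a version constraint in the project. The following is a minimal sketch, assuming Terraform 0.12.x as installed earlier; the file name "versions.tf" is only a suggestion and is not part of the original template.

```hcl
# versions.tf (hypothetical file name; any .tf file in the project works)
terraform {
  # Assumption: the 0.12.x binary installed earlier in this post.
  required_version = ">= 0.12"

  required_providers {
    # Terraform 0.12 syntax. Newer releases expect an object with
    # "source" and "version" keys here instead of a plain string.
    # "~> 3.2" accepts 3.2 and later 3.x releases, but not 4.0.
    google = "~> 3.2"
  }
}
```

After adding a constraint, run "terraform init" again so that Terraform selects a provider release that matches it.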

‍ 

Terraform Resources

Let's move to the creation of the virtual resources of our infrastructure. Using Terraform you can define the names and the configurations of the cloud resources you want to create. In our case, it will be a GKE cluster.

Create a file called "gke.tf" and paste the following code:

 

resource "google_container_cluster" "primary" {
  name               = "<cluster_name>"
  network            = "default"
  location           = "<location>"
  initial_node_count = 1
}

‍ 

Make sure to replace <cluster_name> and <location> with real values, for example:

resource "google_container_cluster" "primary" {
  name               = "my-gke-cluster"
  network            = "default"
  location           = "europe-west1"
  initial_node_count = 1
}

‍ 

Terraform Plan

Terraform has a declarative DSL. This means that you only describe the desired state in a ".tf" file, and Terraform is responsible for achieving it. In other words, you don't need to describe, step by step, how to create the desired infrastructure.

When you run "terraform plan", Terraform builds an execution plan: it compares the current state of your resources with the desired state described in your ".tf" files and lists what it would create, change, or destroy. Nothing is modified at this stage.

Terraform Apply

By default, "terraform plan" only prints the execution plan. To save it to a local file that the apply command can consume, run "terraform plan -out=<file>" and pass that file to "terraform apply". Without a saved plan, "terraform apply" computes a fresh plan and asks for confirmation before it changes any resource, in our case "my-gke-cluster".

Since this is the first time we create the cluster, there is nothing to update; everything will be created from scratch.

Let's run the apply command to create the cluster:

 

terraform apply

‍ 

Terraform shows the attributes of the GKE cluster it is going to create and asks for confirmation; type "yes" to proceed. When it finishes, you should see:

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
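
Once the cluster exists, its attributes are recorded in the Terraform state and can be exposed as outputs. The sketch below is an optional addition rather than part of the original template; the file name "outputs.tf" and the output names are assumptions, while "name" and "endpoint" are attributes exported by the "google_container_cluster" resource.

```hcl
# outputs.tf (hypothetical file name)

# Name of the cluster, as set in gke.tf.
output "cluster_name" {
  description = "Name of the GKE cluster"
  value       = google_container_cluster.primary.name
}

# IP address of the cluster's Kubernetes master, known only after creation.
output "cluster_endpoint" {
  description = "IP address of the cluster's Kubernetes master"
  value       = google_container_cluster.primary.endpoint
}
```

After the next "terraform apply", running "terraform output cluster_endpoint" should print the endpoint.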

 ‍

Updating Resources with Terraform

In the previous part of this tutorial, we created a Kubernetes cluster using a simple Terraform template. To update a cluster, you can use Terraform too. Add or update the configurations you want in the cluster desired state using the same "tf" file ("gke.tf").

Say we want to add a node pool of preemptible nodes with one "n1-standard-1" node, and give those nodes the OAuth scopes needed to write to Stackdriver Logging and Monitoring.

We need to update our code by adding:

 

resource "google_container_node_pool" "primary_preemptible_nodes" {
  name       = "my-node-pool"
  location   = "europe-west1"
  cluster    = google_container_cluster.primary.name
  node_count = 1

  node_config {
    preemptible  = true
    machine_type = "n1-standard-1"

    metadata = {
      disable-legacy-endpoints = "true"
    }

    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }
}

‍ 

This is our final "gke.tf" file:

 

resource "google_container_cluster" "primary" {
  name               = "my-gke-cluster"
  network            = "default"
  location           = "europe-west1"
  initial_node_count = 1
}

resource "google_container_node_pool" "primary_preemptible_nodes" {
  name       = "my-node-pool"
  location   = "europe-west1"
  cluster    = google_container_cluster.primary.name
  node_count = 1

  node_config {
    preemptible  = true
    machine_type = "n1-standard-1"

    metadata = {
      disable-legacy-endpoints = "true"
    }

    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }
}
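
One thing to be aware of: because of "initial_node_count = 1", the cluster block still creates its own default node pool, so the cluster ends up with two pools. If the separately managed pool should be the only one, a commonly used pattern is to have the default pool removed right after creation. The following is a hedged sketch of that variant of the cluster block, which would replace the one above; it is not part of the original template.

```hcl
resource "google_container_cluster" "primary" {
  name     = "my-gke-cluster"
  network  = "default"
  location = "europe-west1"

  # A cluster cannot be created without any node, so create the
  # smallest possible default pool and have it deleted immediately.
  remove_default_node_pool = true
  initial_node_count       = 1
}
```

The "google_container_node_pool" resource stays unchanged and becomes the only pool in the cluster.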

‍ 

Note that instead of typing:

 

..
"primary_preemptible_nodes" {
  name       = "my-node-pool"
  location   = "europe-west1"
  cluster    = "my-gke-cluster"
  node_count = 1
..

‍ 

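We used a reference to the "name" attribute of the cluster resource. Besides avoiding a duplicated string, the reference tells Terraform that the node pool depends on the cluster, so the cluster is created first: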

..
"primary_preemptible_nodes" {
  name       = "my-node-pool"
  location   = "europe-west1"
  cluster    = google_container_cluster.primary.name
  node_count = 1
..
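
Strictly speaking, "google_container_cluster.primary.name" is a reference to a resource attribute, not an input variable. For comparison, the sketch below declares an actual input variable for the location, which is otherwise repeated in both resources. The variable name "location" is an assumption made for this example.

```hcl
# Input variable: a value chosen by whoever runs Terraform.
variable "location" {
  type        = string
  description = "Region or zone used by the cluster and its node pool"
  default     = "europe-west1"
}

resource "google_container_node_pool" "primary_preemptible_nodes" {
  name       = "my-node-pool"
  location   = var.location                          # input variable
  cluster    = google_container_cluster.primary.name # resource reference
  node_count = 1

  node_config {
    preemptible  = true
    machine_type = "n1-standard-1"

    metadata = {
      disable-legacy-endpoints = "true"
    }

    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }
}
```

The cluster block would use "location = var.location" as well, so that both resources stay in the same place. The default can be overridden at run time with the "-var" option or a ".tfvars" file.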

 ‍

Destroying a Cluster Using Terraform

We used Terraform to create a cluster. The same tool can destroy the resources it tracks in its state, and it updates that state once they are gone.
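
Because both updates and destruction rely on the state, losing the local "terraform.tfstate" file means Terraform no longer knows about the cluster. Teams often keep the state in a shared remote backend instead. The sketch below assumes a Google Cloud Storage bucket that already exists; the bucket name and prefix are hypothetical, and the credentials path reuses the key saved earlier in this post.

```hcl
terraform {
  backend "gcs" {
    # Hypothetical bucket name; create the bucket beforehand.
    bucket = "my-terraform-state-bucket"

    # Hypothetical prefix under which this project's state is stored.
    prefix = "gke/my-gke-cluster"

    # Backend blocks cannot call functions such as file(),
    # so the key is given as a plain path.
    credentials = "./auth/serviceaccount.json"
  }
}
```

After adding or changing a backend, run "terraform init" again; Terraform then offers to copy the existing local state to the new location.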

To destroy the cluster, you should use:

 

terraform destroy
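
Since "terraform destroy" removes every resource in the state, it can be worth guarding a long-lived cluster against accidental deletion. Terraform's "lifecycle" block offers "prevent_destroy" for that purpose. The following is a sketch applied to the cluster block from this post, as an optional addition:

```hcl
resource "google_container_cluster" "primary" {
  name               = "my-gke-cluster"
  network            = "default"
  location           = "europe-west1"
  initial_node_count = 1

  lifecycle {
    # Any plan that would destroy this cluster fails with an error
    # until this setting is removed or set to false.
    prevent_destroy = true
  }
}
```

With this in place, "terraform destroy" stops with an error instead of deleting the cluster; remove the setting when you really want to tear the cluster down.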

‍ 

Conclusion

In this tutorial, we have seen how to make creating a GKE cluster an easy and, above all, reproducible task.

There are many customizations you can add to your GKE cluster; some can be applied in place to a running cluster, while others require destroying and recreating it.

Combining the power of IaC and cloud computing is a good way to create a self-service for your developers. Adding a layer of configuration management to your template files allows you to have more control over your self-service infrastructure.

If you're interested in using Prometheus to monitor your Kubernetes, check out our article HA Kubernetes Monitoring using Prometheus and Thanos for further reading. Also, jump onto our MetricFire free trial and start monitoring with Prometheus. Feel free to book a demo and speak with us directly about your monitoring needs.
