2.1 Architecture Overview
2.3 Deploying the Sample Application
2.4 Creating SSL Certs and the Kubernetes Secret for Prometheus Adapter
2.5 Creating the Prometheus Adapter ConfigMap
2.6 Creating the Prometheus Adapter Deployment
2.7 Creating the Prometheus Adapter API Service
3. Testing the Set-Up
One of the major advantages of using Kubernetes for container orchestration is that it makes it really easy to scale our application horizontally and account for increased load. Natively, the Horizontal Pod Autoscaler can scale a deployment based on CPU and memory usage, but in more complex scenarios we want to account for other metrics before making scaling decisions.
Enter Prometheus Adapter. Prometheus is the standard tool for monitoring deployed workloads and the Kubernetes cluster itself. Prometheus Adapter lets us leverage the metrics collected by Prometheus and use them to make scaling decisions. These metrics are exposed through an API service and can be readily consumed by our Horizontal Pod Autoscaler object.
We will be using Prometheus Adapter to pull custom metrics from our Prometheus installation and then let the horizontal pod autoscaler use them to scale the pods up or down. The Prometheus Adapter will run as a Deployment exposed through a Service in our cluster. Generally, a single replica of the adapter is enough for small to medium-sized clusters. However, if you have a very large cluster, you can run multiple replicas of Prometheus Adapter distributed across nodes using node affinity and pod anti-affinity properties.
We will be using a Prometheus-Thanos Highly Available deployment. More about it can be read here.
Let’s first deploy a sample app over which we will be testing our Prometheus metrics autoscaling. We can use the manifest below to do it:
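A minimal sketch of such a manifest follows. The resource names, labels, and the LoadBalancer service type are assumptions; the important parts are the nginx namespace and a container image built with the nginx VTS module so that metrics are served at /status/format/prometheus on port 80:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        # Assumption: any nginx image compiled with nginx-module-vts
        # will do; the stock nginx image does not include it.
        image: nginx:latest
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: nginx
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
```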
This will create a namespace named nginx and deploy a sample nginx application in it. The application can be accessed using the service, and it also exposes nginx VTS metrics at the endpoint /status/format/prometheus over port 80. For the sake of our setup, we have created a DNS entry for the external IP which maps to nginx.gotham.com.
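Assuming the DNS entry above resolves from your machine, the raw metrics can be inspected directly:

```shell
# Fetch the VTS metrics in Prometheus exposition format
curl http://nginx.gotham.com/status/format/prometheus
```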
These are all the metrics currently exposed by the application:
Among these we are particularly interested in nginx_vts_server_requests_total. We will be using the value of this metric to determine whether or not to scale our nginx deployment.
We can use the Makefile below to generate openssl certs and corresponding Kubernetes secret:
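A sketch of such a Makefile is shown below. The file names and the secret name cm-adapter-serving-certs are assumptions following prometheus-adapter conventions (the adapter deployment must mount a secret with this name):

```makefile
.PHONY: certs secret clean

certs: serving.crt serving.key

serving.key:
	openssl genrsa -out serving.key 2048

serving.csr: serving.key
	openssl req -new -key serving.key -out serving.csr -subj "/CN=serving"

serving.crt: serving.csr
	openssl x509 -req -in serving.csr -signkey serving.key -out serving.crt -days 3650

# Creates the secret the adapter will mount; assumes the
# monitoring namespace already exists.
secret: certs
	kubectl create secret generic cm-adapter-serving-certs \
		--from-file=serving.crt --from-file=serving.key -n monitoring

clean:
	rm -f serving.key serving.csr serving.crt
```

With this sketch, `make secret` generates self-signed certificates and creates the corresponding secret in the monitoring namespace.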
Once you have created the Makefile, just run the following command:
and it will create the SSL certificates and the corresponding Kubernetes secret for you. Make sure that the monitoring namespace exists before you create the secret. This secret will be used by the Prometheus Adapter, which we will deploy next.
Use the manifest below to create the Prometheus Adapter configmap:
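A sketch of such a ConfigMap follows. The name adapter-config and the label overrides (kubernetes_namespace, kubernetes_pod_name) are assumptions and must match the labels your Prometheus attaches to scraped series; the rule converts the nginx_vts_server_requests_total counter into a per-second rate:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
data:
  config.yaml: |
    rules:
    # Pick up the VTS request counter...
    - seriesQuery: 'nginx_vts_server_requests_total'
      resources:
        overrides:
          kubernetes_namespace:
            resource: namespace
          kubernetes_pod_name:
            resource: pod
      # ...expose it as nginx_vts_server_requests_per_second...
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      # ...computed as a 1m rate over the matched series.
      metricsQuery: (sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))
```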
This ConfigMap specifies only a single metric, but we can always add more; you can refer to this link for how to do so. It is highly recommended to fetch only those metrics that we actually need for the horizontal pod autoscaler. This helps with debugging, and these add-ons also generate very verbose logs which get ingested by our logging backend. Fetching metrics we don't need will not only load the service but also spam the logging backend with unnecessary logs. If you want to learn about the ConfigMap in detail, please read here.
Use the following manifest to deploy Prometheus Adapter:
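A sketch of the Deployment and its Service follows. The image tag, resource names, and service account are assumptions (the RBAC objects the adapter needs are omitted for brevity); the --prometheus-url argument and the cm-adapter-serving-certs and adapter-config mounts tie it to the earlier steps:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
  labels:
    app: custom-metrics-apiserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metrics-apiserver
  template:
    metadata:
      labels:
        app: custom-metrics-apiserver
    spec:
      containers:
      - name: custom-metrics-apiserver
        # Assumption: pin this to a released prometheus-adapter tag.
        image: directxman12/k8s-prometheus-adapter-amd64
        args:
        - /adapter
        - --secure-port=6443
        - --tls-cert-file=/var/run/serving-cert/serving.crt
        - --tls-private-key-file=/var/run/serving-cert/serving.key
        - --logtostderr=true
        - --prometheus-url=http://thanos-querier.monitoring:9090/
        - --metrics-relist-interval=30s
        - --config=/etc/adapter/config.yaml
        ports:
        - containerPort: 6443
        volumeMounts:
        - mountPath: /var/run/serving-cert
          name: volume-serving-cert
          readOnly: true
        - mountPath: /etc/adapter/
          name: config
          readOnly: true
      volumes:
      - name: volume-serving-cert
        secret:
          secretName: cm-adapter-serving-certs
      - name: config
        configMap:
          name: adapter-config
---
apiVersion: v1
kind: Service
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  selector:
    app: custom-metrics-apiserver
  ports:
  - port: 443
    targetPort: 6443
```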
This will create our deployment, which will spawn the Prometheus Adapter pod to pull metrics from Prometheus. Note that we have set the argument --prometheus-url=http://thanos-querier.monitoring:9090/. This is because we have deployed a Thanos-backed Prometheus cluster in the monitoring namespace of the same Kubernetes cluster as the Prometheus Adapter. You can change this argument to point to your own Prometheus deployment.
If you notice the logs for this container, you can see that it is fetching the metric defined in the config file:
The manifest below will create an APIService so that our Prometheus Adapter is accessible to the Kubernetes API, and thus metrics can be fetched by our Horizontal Pod Autoscaler.
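A sketch of the APIService follows; the service name and namespace are assumptions and must point at whatever Service fronts your adapter (here assumed to be custom-metrics-apiserver in monitoring). insecureSkipTLSVerify is set because our serving certificate is self-signed:

```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: custom-metrics-apiserver
    namespace: monitoring
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
```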
Let’s check all of the custom metrics that are available:
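The custom metrics API can be queried directly through the API server:

```shell
# List every metric the adapter currently exposes
# (jq is optional, it just pretty-prints the JSON)
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
```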
We can see that the nginx_vts_server_requests_per_second metric is available.
Now, let’s check the current value of this metric:
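Using the nginx namespace from our sample application:

```shell
# Query the per-pod value of the metric across all pods in the namespace
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/nginx/pods/*/nginx_vts_server_requests_per_second" | jq .
```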
Create an HPA which will utilize these metrics. We can use the manifest below to do it:
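A sketch of such an HPA follows; the HPA name, replica bounds, and target value are assumptions you should tune for your workload. The autoscaling/v2beta1 schema is used here (newer clusters should use autoscaling/v2, which expresses the target as target.type: AverageValue):

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-custom-hpa
  namespace: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      # The custom metric exposed by the Prometheus Adapter
      metricName: nginx_vts_server_requests_per_second
      # Scale out when the per-pod average exceeds 4 req/s
      targetAverageValue: 4000m
```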
Once you have applied this manifest, you can check the current status of the HPA as follows:
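For example (nginx being the namespace of our sample application):

```shell
# Shows target metric value, current/desired replicas
kubectl get hpa -n nginx
```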
Now, let's generate some load on our service. We will be using a utility called vegeta for this.
In a separate terminal, run the following command:
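A load-generation command along these lines should work; the request rate is an assumption, pick one high enough to push the per-pod average past your HPA target:

```shell
# Fire 500 req/s at the service indefinitely; interrupting the
# pipeline with Ctrl-C makes vegeta report print its summary.
echo "GET http://nginx.gotham.com/" | vegeta attack -rate=500 -duration=0 | vegeta report
```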
If you simultaneously monitor the nginx pods and the horizontal pod autoscaler, you should see something like this:
The horizontal pod autoscaler clearly scaled our pods up to meet demand, and when we interrupt the vegeta command we can see the vegeta report, which shows that all our requests were served by the application.
This set-up demonstrates how we can use Prometheus Adapter to autoscale deployments based on custom metrics. For the sake of simplicity we have only fetched one metric from our Prometheus server. However, the adapter ConfigMap can be extended to fetch some or all of the available metrics and use them for autoscaling.
If the Prometheus installation is outside of our Kubernetes cluster, we just need to make sure that the query endpoint is accessible from the cluster and update it in the adapter deployment manifest. We can have complex scenarios where multiple metrics are fetched and used in combination to make scaling decisions.
Feel free to reach out should you have any questions around the setup and I would be happy to assist you.
This article was written by our guest blogger Vaibhav Thakur. If you liked this article, check out his LinkedIn for more.