Kubernetes on AWS Resources

November 22, 2019

Table of Contents

1. Why is Managing Access to AWS Services a Problem?

2. Diving into Implementation with Kube2iam

          2.1. Overall Architecture

          2.2. Implementation

                    2.2.1. Creating and Attaching IAM Roles

                    2.2.2. Deploying Kube2iam

                    2.2.3. Testing Access from a Test Pod

3. Kiam

          3.1. Overall Architecture

          3.2. Implementation

                    3.2.1. Creating and Attaching IAM Roles

                    3.2.2. Deploying Cert Manager and Generating Certificates

                    3.2.3. Generate CA Private Key and Self-signed Certificate for Kiam Agent-server TLS

                    3.2.4. Annotating Resources

                    3.2.5. Deploying Kiam Agent and Server

                              3.2.5.1. Kiam Server

                              3.2.5.2. Kiam Agent

                              3.2.5.3. Testing

4. IAM Roles for Service Accounts (IRSA)

5. Conclusion

Introduction

Kubernetes is an open-source container orchestration system that allows you to get the most out of your machines. Using Kubernetes, however, raises the problem of managing pods' access to various Amazon Web Services (AWS) resources. This article covers how to overcome these problems using specific tools. Here’s how we’ve organized the information:

  • Why managing access can be a problem
  • Managing access through Kube2iam
  • Managing access through Kiam
  • IAM Roles for Service Accounts (IRSA)

1. Why is Managing Access to AWS Services a Problem? 

Imagine this: A Kubernetes node is hosting an application pod that needs access to AWS DynamoDB tables. Meanwhile, another pod on the same node needs access to an AWS S3 bucket. For both applications to work properly, the Kubernetes worker node must access both the DynamoDB tables and the S3 bucket at the same time.

Now think about this happening to hundreds of pods, all requiring access to various AWS resources. The pods are constantly being scheduled on a Kubernetes cluster that needs to access several different AWS services simultaneously… It’s a lot!

One way to solve this would be to give the Kubernetes node—and therefore the pods—access to all AWS resources. However, this leaves your system an easy target for any potential attacker: if a single pod or node is compromised, an attacker will gain access to your entire AWS infrastructure. To avoid this, you can use tools like Kube2iam, Kiam, and IAM Roles for Service Accounts (IRSA) to manage access from Kubernetes pods to AWS resources. The best part? All the access API calls and authentication metrics can be scraped by Prometheus and visualized in Grafana.

2. Diving into Implementation with Kube2iam

2.1. Overall Architecture

Kube2iam is deployed as a DaemonSet in your cluster. Therefore, a pod of Kube2iam will be scheduled to run on every worker node of your Kubernetes cluster. Whenever a different pod makes an AWS API call to access resources, that call will be intercepted by the Kube2iam pod running on that node. Kube2iam then ensures that the pod is assigned appropriate credentials to access the resource. 

You also need to specify an Identity and Access Management (IAM) role in the pod spec. Under the hood, the Kube2iam pod will retrieve temporary credentials for the IAM role of the caller and return them to said caller. Basically, Kube2iam acts as a proxy for all Amazon Elastic Compute Cloud (EC2) metadata API calls. (A Kube2iam pod should run with host networking enabled so that it can make the EC2 metadata API calls.)
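To make the pod-spec side concrete, here is a minimal sketch of a deployment whose pods request a role through Kube2iam. All names (my-app, my-role) are placeholders, not values from this setup:

```yaml
# Sketch: a deployment whose pods request an IAM role via Kube2iam.
# The deployment and role names are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        # Kube2iam intercepts this pod's metadata API calls and returns
        # temporary credentials for this role.
        iam.amazonaws.com/role: my-role
    spec:
      containers:
        - name: app
          image: my-app:latest
```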

 

2.2. Implementation 

2.2.1. Creating and Attaching IAM Roles 

  1. Create an IAM role named my-role which has access to the required AWS resources (for example, an AWS S3 bucket).
  2. Follow these steps to enable a trust relationship between the role and the role attached to the Kubernetes worker nodes. (Make sure that the role attached to the worker nodes has very limited permissions: all the API calls and access requests are made by containers running on the node, which receive credentials through Kube2iam, so the worker node IAM roles do not need access to a large number of AWS resources.)

            a. Go to the newly created role in AWS console and select the ‘Trust relationships’ tab

            b. Click ‘Edit trust relationship’

            c. Add the following content to the policy:

<p>CODE:https://gist.github.com/denshirenji/80c5066bf0ebb8ca5495a4cdc7977144.js</p>

       

          d. Enable ‘Assume role’ for Node Pool IAM roles. Add the following content to Nodes IAM policy:

<p>CODE:https://gist.github.com/denshirenji/ec9f6f8c6aa635a23642329f604db673.js</p>


       3. Add the IAM role's name to the deployment as an annotation.

<p>CODE:https://gist.github.com/denshirenji/3107648076de1d366194beec04eae439.js</p>

 

2.2.2. Deploying Kube2iam

  1. Create the service account, ClusterRole and ClusterRoleBinding to be used by Kube2iam pods. The ClusterRole should have 'get', 'watch' and 'list' access to namespaces and pods under all API groups. You can use the manifest below to create them:

<p>CODE:https://gist.github.com/denshirenji/b72868394a940b54ced7c5ce285591af.js</p>

        2. Deploy the Kube2iam DaemonSet by using the manifest below: 

<p>CODE:https://gist.github.com/denshirenji/7f531008f1a215e6e6035e1381177486.js</p>

Note: The Kube2iam container runs with the arguments --iptables=true and --host-ip=$(HOST_IP), and in privileged mode.

<p>CODE:https://gist.github.com/denshirenji/00aa2fe6bf04c5db82a1f3337e86deb9.js</p>


These settings prevent containers running in other pods from directly accessing the EC2 metadata API and gaining unwanted access to AWS resources. Traffic to 169.254.169.254 must be proxied for Docker containers. Alternatively, this can be applied by running the following command on each Kubernetes worker node:

<p>CODE:https://gist.github.com/denshirenji/5f504238d23cdaa95d54110cf65f8e73.js</p>


2.2.3. Testing Access from a Test Pod

To check whether your Kube2iam deployment and IAM settings work, you can deploy a test pod with an IAM role specified as an annotation. If everything works, you should be able to check which IAM role gets attached to your pod. This can be easily verified by querying the EC2 metadata API. Let’s deploy a test pod using the manifest below:

<p>CODE:https://gist.github.com/denshirenji/4214aad3d62d79428c77d132ea50c865.js</p>

Run the following command in the test pod created:  

<p>CODE:https://gist.github.com/denshirenji/5d937db317d875a5a5f90ff241c9a0f8.js</p>

You should get my-role as the response to this query.

I highly recommend tailing the logs of the Kube2iam pod running on that node in order to gain a deeper understanding of how and when the API calls are being intercepted. Once the setup works as expected, you should turn off verbosity in the Kube2iam deployment in order to avoid bombarding your logging backend.

3. Kiam

While very helpful, Kube2iam has two major issues that Kiam aims to resolve:

  • Data races under load conditions: When you have very high spikes in application load and there are several pods in the cluster, Kube2iam sometimes returns incorrect credentials to those pods. The GitHub issue can be referenced here
  • Pre-fetching of credentials: Kiam assigns credentials for the IAM role specified in the pod spec before the container process boots in the pod. By prefetching credentials, Kiam reduces start latency and improves reliability.

Additional features of Kiam include:

  • Use of structured logging to improve integration into your Elasticsearch, Logstash, Kibana (ELK) setup, with pod names, roles, access key IDs, etc.
  • Use of metrics to track response times, cache hit rates, etc. These metrics can be readily scraped by Prometheus and rendered over Grafana. 


3.1. Overall Architecture

Kiam is based on agent-server architecture. 

  • Kiam Agent: This is the process that would typically be deployed as a DaemonSet to ensure that pods have no direct access to the AWS metadata API. Instead, the Kiam agent runs an HTTP proxy which intercepts credential requests and passes on everything else.
  • Kiam Server: This process is responsible for connecting to the Kubernetes API server to watch pods, and for communicating with AWS Security Token Service (STS) to request credentials. It also maintains a cache of credentials for roles currently in use by running pods, ensuring that credentials are refreshed every few minutes and stored before the pods need them.


3.2. Implementation

Similar to Kube2iam, in order for a pod to get credentials for an IAM role, that role should be specified as an annotation in the deployment manifest. Additionally, you need to specify which IAM roles can be allocated inside a particular namespace using appropriate annotations. This enhances security and lets you fine-tune control over IAM roles.
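The two levels of annotation can be sketched as follows; the namespace and role names are placeholders:

```yaml
# Sketch: Kiam needs both a namespace-level allowlist and a pod-level
# role annotation. All names here are placeholders.
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace
  annotations:
    # Regular expression of role names pods in this namespace may assume.
    iam.amazonaws.com/permitted: "my-role"
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: my-namespace
  annotations:
    # Must match the namespace's permitted regex above.
    iam.amazonaws.com/role: my-role
spec:
  containers:
    - name: app
      image: my-app:latest
```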


3.2.1. Creating and Attaching IAM Roles

1. Create an IAM role named kiam-server with appropriate access to AWS resources.

2. Enable a trust relationship between the kiam-server role and the role attached to the Kubernetes master nodes by following these steps. (Make sure that the role attached to the worker nodes has very limited permissions: all the API calls and access requests are made by containers running on the node, which receive credentials through Kiam. The worker node IAM roles do not need access to many AWS resources.)

     a. Go to the newly created role in AWS console and select the ‘Trust relationships’ tab. 

     b. Click on ‘Edit trust relationship’.

     c. Add the following content to the policy:

<p>CODE:https://gist.github.com/denshirenji/ea730c63ce769b7a689cef0f58f60ad9.js</p>


3.    Add an inline policy to the kiam-server role.

 <p>CODE:https://gist.github.com/denshirenji/4f7deefd2eacc5ac567bd0b1a2d8e8c8.js</p>


4.   Create the IAM role (let's call it my-role) with appropriate access to AWS resources.

5.   Enable a trust relationship between the newly created role and the kiam-server role.

     To do so: 

     a. Go to the newly created role in AWS console and select ‘Trust relationships’ 

     b. Click ‘Edit trust relationship’

     c. Add the following content to the policy:

<p>CODE:https://gist.github.com/denshirenji/442f55f7918e0e89ee48c5e91f5bf6bb.js</p>


6.    Enable ‘Assume Role’ for Master Pool IAM roles. Add the following content as an inline policy to the master IAM roles:

<p>CODE:https://gist.github.com/denshirenji/c343fd2543c20b1715bf4dd019819cd4.js</p>

All communication between the Kiam agent and server is TLS-encrypted, which enhances security. To enable this, we first need to deploy cert-manager in our Kubernetes cluster and generate certificates for agent-server communication.

3.2.2. Deploying Cert Manager and Generating Certificates

1.   Install the CustomResourceDefinition resources separately.

kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.8/deploy/manifests/00-crds.yaml


2.   Create the namespace for cert-manager.

kubectl create namespace cert-manager


3.   Label the cert-manager namespace to disable resource validation.

kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true

4.   Add the Jetstack Helm repository.

helm repo add jetstack https://charts.jetstack.io


5.   Update your local Helm chart repository cache.

helm repo update


6.   Install the cert-manager Helm chart.

helm install --name cert-manager --namespace cert-manager --version v0.8.0 jetstack/cert-manager


3.2.3. Generate CA Private Key and Self-signed Certificate for Kiam Agent-server TLS

1.   Generate the CRT file.

<p>CODE:https://gist.github.com/denshirenji/b6c515837da077fa52981cc60c95d492.js</p>


2.   Save the CA key pair as a secret in Kubernetes.

<p>CODE:https://gist.github.com/denshirenji/bdfbe133e94b272f723e2eca8ed8acf1.js</p>


3.   Deploy cluster issuer and issue the certificate.

     a.    Create the Kiam namespace. 

<p>CODE:https://gist.github.com/denshirenji/403ffbc5ef27f469276ca90995cbf1f4.js</p>


     b.   Deploy the cluster issuer and issue the certificate.

<p>CODE:https://gist.github.com/denshirenji/ac299bd11a3d30e173af749a99c35992.js</p>


4.   Test if certificates are issued correctly.

<p>CODE:https://gist.github.com/denshirenji/826a13cffee05407a4cb440a5c636618.js</p>


3.2.4. Annotating Resources

1.    Add the IAM role’s name to deployment as an annotation. 

<p>CODE:https://gist.github.com/denshirenji/7a97210d19f323b20511dd971d9f8eb0.js</p>

2.    Add role annotation to the namespace in which the pods will run. You don’t need to do this with Kube2iam. 

<p>CODE:https://gist.github.com/denshirenji/0318a496d16decbacc2fcbb004171b88.js</p>


The default is not to allow any roles. You can use a regex, as shown above, to allow all roles, or you can even specify a particular role per namespace.


3.2.5. Deploying Kiam Agent and Server

3.2.5.1. Kiam Server

The manifest below deploys the following:

  1. Kiam Server DaemonSet which will run on Kubernetes master nodes (configured to use the TLS secret created above)
  2. Kiam Server service
  3. Service account, ClusterRole and ClusterRoleBinding required by Kiam server

<p>CODE:https://gist.github.com/denshirenji/8927fb1b23c5c6b5f15b4342e9a138bb.js</p>

Note

  1. The scheduler toleration and node selector that we have in place here make sure that the Kiam server pods get scheduled on Kubernetes master nodes only. This is why we enabled the trust relationship between the kiam-server IAM role and the IAM role attached to the Kubernetes master nodes (above).

<p>CODE:https://gist.github.com/denshirenji/863222583c4f78b0a18e381615db56c9.js</p>


<p>CODE:https://gist.github.com/denshirenji/da5d512609629553605be3e89bed3f16.js</p>

  2. The kiam-server role ARN is provided as an argument to the Kiam server container. Make sure you update the <KIAM_SERVER_ROLE_ARN> field in the manifest above to the ARN of the role you created.
  3. The ClusterRole and ClusterRoleBinding created for the Kiam server grant it the minimal permissions required to operate effectively. Consider carefully before changing them.
  4. Make sure the path to the SSL certs is set correctly according to the secret you created using cert-manager certificates. This is important for establishing secure communication between the Kiam server and Kiam agent pods.


3.2.5.2. Kiam Agent

The manifest below deploys a Kiam Agent DaemonSet, which will run on Kubernetes worker nodes only:

<p>CODE:https://gist.github.com/denshirenji/7aca7af91da20156951fde25ab18f355.js</p>


It should be noted that the Kiam agent also runs with host networking set to true, similar to Kube2iam. Also, one of the arguments to the Kiam agent’s container is the name of the service used to reach the Kiam server, in this case kiam-server:443. Therefore, we should deploy the Kiam server before deploying the Kiam agent.

Also, the container argument --gateway-timeout-creation defines how long the agent waits for the Kiam server pod to be up before it tries to connect. It can be tweaked depending on how long pods take to come up in your Kubernetes cluster. Usually, a thirty-second waiting period is enough.
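For orientation, the agent’s container arguments might look roughly like the sketch below. The image tag, interface prefix, and certificate paths are assumptions that depend on your CNI plugin, your cert-manager secret, and the Kiam version; adjust them against the manifest above:

```yaml
# Sketch: illustrative args for the Kiam agent container. The image tag,
# interface prefix, and cert paths are assumptions, not authoritative values.
containers:
  - name: agent
    image: quay.io/uswitch/kiam:v3.3
    args:
      - agent
      - --iptables                      # redirect metadata API traffic to the agent proxy
      - --host-interface=cali+          # interface prefix; depends on your CNI plugin
      - --json-log
      - --cert=/etc/kiam/tls/tls.crt    # TLS material from the cert-manager secret
      - --key=/etc/kiam/tls/tls.key
      - --ca=/etc/kiam/tls/ca.crt
      - --server-address=kiam-server:443
      - --gateway-timeout-creation=30s  # wait up to 30s for the server before failing
```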

3.2.5.3. Testing

The processes for testing the Kiam and Kube2iam setups are the same. You can use a test pod and curl the metadata to check the assigned role. Please ensure that both deployment and namespace are properly annotated.

4. IAM Roles for Service Accounts (IRSA)

Recently, AWS released its own service to allow pods to access AWS resources: IAM Roles for Service Accounts (IRSA). Since a role is authenticated with a service account, it can be shared by all pods to which that service account is attached. This service is available both in AWS EKS and in kops-based installations. You can read more about it here.
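For comparison with the annotation-based tools above, IRSA hangs the role off a service account rather than a pod annotation. A minimal sketch, where the account ID and all names are placeholders:

```yaml
# Sketch: with IRSA the IAM role is bound to a Kubernetes service
# account. The role ARN and names below are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-service-account
  namespace: my-namespace
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-role
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: my-namespace
spec:
  serviceAccountName: my-service-account  # every pod using this account receives the role
  containers:
    - name: app
      image: my-app:latest
```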

5. Conclusion

The tools covered in this blog help manage access from Kubernetes pods to AWS resources, and they all have their pros and cons.

While Kube2iam is the easiest to implement, the ease of setup comes at the cost of reliability: Kube2iam might not perform dependably under high load conditions. It is better suited for non-production environments or scenarios that don’t experience major traffic surges.

IRSA requires more work than Kube2iam, but given Amazon’s detailed documentation, it may be easier to implement. Since it is so recent, however, IRSA had seen little industry adoption at the time this article was written.

Kiam’s implementation needs cert-manager running and, unlike with Kube2iam, you need to annotate the namespace along with the deployment. Regardless, we highly recommend Kiam: it can be used in all cases, provided you have the resources to run cert-manager and your master nodes can handle a DaemonSet running on them. By using the manifests provided in this post, your setup will be seamless and production-ready. Feel free to reach out if you have any questions!

This article was written by our guest blogger Vaibhav Thakur. If you liked this article, check out his LinkedIn for more.

