Kubernetes on AWS Resources



Introduction

Kubernetes is an open-source container orchestration system that helps you get the most out of your compute resources. Using Kubernetes, however, raises the problem of managing pods' access to various Amazon Web Services (AWS) resources. This article covers how to overcome these problems by using specific tools. Here's how we've organized the information:

  • Why managing access can be a problem
  • Managing access through Kube2iam
  • Managing access through Kiam
  • IAM Roles for Service Accounts (IRSA)


Why is Managing Access to AWS Services a Problem? 

Imagine this: A Kubernetes node is hosting an application pod that needs access to AWS DynamoDB tables. Meanwhile, another pod on the same node needs access to an AWS S3 bucket. For both applications to work properly, the Kubernetes worker node must access both the DynamoDB tables and the S3 bucket at the same time.

Now think about this happening to hundreds of pods, all requiring access to various AWS resources. The pods are constantly being scheduled on a Kubernetes cluster that needs to access several different AWS services simultaneously… It’s a lot!

One way to solve this would be to give the Kubernetes node—and therefore the pods—access to all AWS resources. However, this leaves your system an easy target for any potential attacker: if a single pod or node is compromised, an attacker gains access to your entire AWS infrastructure. To avoid this, you can use tools like Kube2iam, Kiam, and IAM Roles for Service Accounts (IRSA) to manage access from Kubernetes pods to AWS resources. The best part? All the access API calls and authentication metrics can be pulled by Prometheus and visualized in Grafana. If you want to try the Prometheus/Grafana part for yourself, get onto our MetricFire free trial and start sending your data.


Diving into Implementation with Kube2iam

 

Overall Architecture

Kube2iam is deployed as a DaemonSet in your cluster. Therefore, a pod of Kube2iam will be scheduled to run on every worker node of your Kubernetes cluster. Whenever a different pod makes an AWS API call to access resources, that call will be intercepted by the Kube2iam pod running on that node. Kube2iam then ensures that the pod is assigned appropriate credentials to access the resource. 

You also need to specify an Identity and Access Management (IAM) role in the pod spec. Under the hood, the Kube2iam pod retrieves temporary credentials for the caller's IAM role and returns them to the caller. In essence, Kube2iam acts as a proxy for all Amazon Elastic Compute Cloud (EC2) metadata API calls. (A Kube2iam pod should run with host networking enabled so that it can make the EC2 metadata API calls.)
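To make this concrete, here is what the flow looks like from a pod's point of view (the endpoint below is the standard EC2 metadata API; the behavior described in the comments assumes the Kube2iam setup from this section):

# From inside any pod: AWS SDKs ultimately resolve credentials from
# the EC2 metadata API.
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/
# Without Kube2iam, this returns the IAM role of the worker node itself.
# With Kube2iam, the call is intercepted by the Kube2iam pod on that node,
# which responds with the role annotated on the calling pod instead.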

 

Implementation 

Creating and Attaching IAM Roles 

  1. Create an IAM role named my-role which has access to the required AWS resources (for example, an AWS S3 bucket).
  2. Follow these steps to enable a trust relationship between the role and the role attached to the Kubernetes worker nodes. (Make sure that the role attached to the worker nodes has very limited permissions—all the API calls and access requests are made by containers running on the nodes and receive credentials through Kube2iam, so the worker node IAM roles do not need access to a large number of AWS resources.) A CLI sketch of steps 1 and 2 follows this procedure.

            a. Go to the newly created role in the AWS console and select the ‘Trust relationships’ tab

            b. Click ‘Edit trust relationship’

            c. Add the following content to the policy:

  

{
  "Sid": "",
  "Effect": "Allow",
  "Principal": {
    "AWS": "<ARN_KUBERNETES_NODES_IAM_ROLE>"
  },
  "Action": "sts:AssumeRole"
}

  

          d. Enable ‘Assume role’ for the node pool IAM roles by adding the following content to the nodes’ IAM policy:

  

{
  "Sid": "",
  "Effect": "Allow",
  "Action": [
    "sts:AssumeRole"
  ],
  "Resource": [
    "arn:aws:iam::810085094893:instance-profile/*"
  ]
}

  

  3. Add the IAM role's name to the deployment as an annotation.

  

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mydeployment
  namespace: default
spec:
...
  minReadySeconds: 5
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: my-role
    spec:
      containers:
...
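If you prefer the AWS CLI to the console, steps 1 and 2 can be sketched as follows (assuming the trust-policy statement from step 2c and the node policy from step 2d are each wrapped in a standard policy document and saved locally as trust.json and node-assume.json; the file names and the node role name are placeholders):

# Step 1: create the role that the pods will assume.
aws iam create-role \
  --role-name my-role \
  --assume-role-policy-document file://trust.json

# Step 2d: allow the worker-node role to assume pod roles.
aws iam put-role-policy \
  --role-name <KUBERNETES_NODES_IAM_ROLE_NAME> \
  --policy-name assume-pod-roles \
  --policy-document file://node-assume.json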

   

Deploying Kube2iam

  1. Create the service account, ClusterRole, and ClusterRoleBinding to be used by the Kube2iam pods. The ClusterRole should have 'get', 'watch', and 'list' access to namespaces and pods in the core API group. You can use the manifest below to create them:


---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube2iam
  namespace: kube-system
---
apiVersion: v1
kind: List
items:
  - apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRole
    metadata:
      name: kube2iam
    rules:
      - apiGroups: [""]
        resources: ["namespaces","pods"]
        verbs: ["get","watch","list"]
  - apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRoleBinding
    metadata:
      name: kube2iam
    subjects:
    - kind: ServiceAccount
      name: kube2iam
      namespace: kube-system
    roleRef:
      kind: ClusterRole
      name: kube2iam
      apiGroup: rbac.authorization.k8s.io
---


  2. Deploy the Kube2iam DaemonSet using the manifest below:

  

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube2iam
  labels:
    app: kube2iam
  namespace: kube-system
spec:
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      serviceAccount: kube2iam
      containers:
        - image: jtblin/kube2iam:latest
          name: kube2iam
          args:
            - "--auto-discover-base-arn"
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
            - "--host-interface=cali+"
            - "--verbose"
            - "--debug"
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          ports:
            - containerPort: 8181
              hostPort: 8181
              name: http
          securityContext:
            privileged: true
---


Note: The Kube2iam container runs with the arguments --iptables=true and --host-ip=$(HOST_IP), and with privileged mode enabled:


...
          securityContext:
            privileged: true
...


These settings prevent containers running in other pods from directly accessing the EC2 metadata API and gaining unwanted access to AWS resources: traffic to 169.254.169.254 is redirected to the Kube2iam proxy. The same rule can alternatively be applied by running the following command on each Kubernetes worker node:


iptables \
  --append PREROUTING \
  --protocol tcp \
  --destination 169.254.169.254 \
  --dport 80 \
  --in-interface docker0 \
  --jump DNAT \
  --table nat \
  --to-destination `curl 169.254.169.254/latest/meta-data/local-ipv4`:8181
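Once everything is deployed, a quick sanity check looks like this (the label selector matches the DaemonSet manifest above; the iptables inspection must be run as root on a worker node):

# Confirm a Kube2iam pod is running on every worker node.
kubectl -n kube-system get pods -l name=kube2iam -o wide

# On a worker node: confirm the metadata-API redirect is installed.
sudo iptables -t nat -L PREROUTING -n | grep 169.254.169.254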

  

Testing Access from a Test Pod

To check whether your Kube2iam deployment and IAM settings work, you can deploy a test pod with an IAM role specified as an annotation. If everything works, you should be able to check which IAM role gets attached to your pod. This can be easily verified by querying the EC2 metadata API. Let’s deploy a test pod using the manifest below:

  

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: access-test
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: access-test
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: access-test
      annotations:
        iam.amazonaws.com/role: my-role
    spec:
      containers:
      - name: access-test
        image: "iotapi322/worker:v4"


Run the following command from inside the test pod:


curl 169.254.169.254/latest/meta-data/iam/security-credentials/


You should get my-role as the response to this API call.

I highly recommend tailing the logs of the Kube2iam pod running on that node to gain a deeper understanding of how and when the API calls are being intercepted. Once the setup works as expected, remove the --verbose and --debug flags from the Kube2iam deployment to avoid bombarding your logging backend.
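For example, a minimal sketch of tailing those logs (the pod name is a placeholder; substitute the one scheduled on the node hosting your test pod):

# Find the Kube2iam pod scheduled on the relevant node.
kubectl -n kube-system get pods -l name=kube2iam -o wide

# Tail its logs while the test pod makes API calls.
kubectl -n kube-system logs -f kube2iam-xxxxx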


  

Kiam

While very helpful, Kube2iam has two major issues that Kiam aims to resolve:

  • Data races under load conditions: When you have very high spikes in application load and there are several pods in the cluster, Kube2iam sometimes returns incorrect credentials to pods. The GitHub issue can be referenced here.
  • Pre-fetching of credentials: Kiam fetches credentials for the IAM role specified in the pod spec before the container process boots in the pod. By assigning credentials in advance, Kiam reduces startup latency and improves reliability.

 

Additional features of Kiam include:

  • Use of structured logging to improve integration into your Elasticsearch, Logstash, Kibana (ELK) setup, with pod names, roles, access key IDs, etc.
  • Use of metrics to track response times, cache hit rates, etc. These metrics can be readily scraped by Prometheus and rendered over Grafana. 

  

Overall Architecture

Kiam is based on an agent-server architecture.

  • Kiam Agent: This process is typically deployed as a DaemonSet to ensure that pods have no direct access to the AWS metadata API. Instead, the Kiam agent runs an HTTP proxy which intercepts credential requests and passes everything else through.
  • Kiam Server: This process is responsible for connecting to the Kubernetes API server to watch pods, and for communicating with AWS Security Token Service (STS) to request credentials. It also maintains a cache of credentials for roles currently in use by running pods, refreshing them every few minutes so they are ready before the pods need them.

  

Implementation

As with Kube2iam, for a pod to get credentials for an IAM role, that role must be specified as an annotation in the deployment manifest. Additionally, you need to specify which IAM roles can be allocated inside a particular namespace using appropriate annotations. This enhances security and lets you fine-tune control of IAM roles.

  

Creating and Attaching IAM Roles

1. Create an IAM role named kiam-server with appropriate access to AWS resources.

2. Enable a trust relationship between the kiam-server role and the role attached to the Kubernetes master nodes by following these steps. (Make sure that the roles attached to the Kubernetes nodes have very limited permissions—all the API calls and access requests are made by containers running on the nodes and receive credentials through Kiam, so the node IAM roles do not need access to many AWS resources.)

     a. Go to the newly created role in the AWS console and select the ‘Trust relationships’ tab.

     b. Click on ‘Edit trust relationship’.

     c. Add the following content to the policy:


{
  "Sid": "",
  "Effect": "Allow",
  "Principal": {
    "AWS": "<ARN_KUBERNETES_MASTER_IAM_ROLE>"
  },
  "Action": "sts:AssumeRole"
}

  

3.    Add an inline policy to the kiam-server role.


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sts:AssumeRole"
      ],
      "Resource": "*"
    }
  ]
}

  

4.   Create the IAM role (let's call it my-role) with appropriate access to AWS resources.

5.   Enable a trust relationship between the newly created role and the Kiam server role.

     To do so: 

     a. Go to the newly created role in the AWS console and select the ‘Trust relationships’ tab

     b. Click ‘Edit trust relationship’

     c. Add the following content to the policy:


{
  "Sid": "",
  "Effect": "Allow",
  "Principal": {
    "AWS": "<ARN_KIAM-SERVER_IAM_ROLE>"
  },
  "Action": "sts:AssumeRole"
}

  

6.    Enable ‘Assume Role’ for the master pool IAM roles. Add the following content as an inline policy to the master IAM roles:


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sts:AssumeRole"
      ],
      "Resource": "<ARN_KIAM-SERVER_IAM_ROLE>"
    }
  ]
}


All communication between the Kiam agent and server is TLS-encrypted, which enhances security. To set this up, we first need to deploy cert-manager in our Kubernetes cluster and generate certificates for agent-server communication.


Deploying Cert Manager and Generating Certificates

1.   Install the custom resource definition resources separately.

  

kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.8/deploy/manifests/00-crds.yaml

  

2.   Create the namespace for cert-manager.

  

kubectl create namespace cert-manager

  

3.   Label the cert-manager namespace to disable resource validation.

  

kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true

  

4.   Add the Jetstack Helm repository.

  

helm repo add jetstack https://charts.jetstack.io

  

5.   Update your local Helm chart repository cache.

  

helm repo update

  

6.   Install the cert-manager Helm chart.

  

helm install --name cert-manager --namespace cert-manager --version v0.8.0 jetstack/cert-manager

  

Generate CA Private Key and Self-signed Certificate for Kiam Agent-server TLS

1.   Generate the CA private key and certificate.


openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=kiam" -days 3650 -reqexts v3_req -extensions v3_ca -out ca.crt
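Before moving on, you can sanity-check the generated CA certificate:

# Inspect the CA certificate's subject and validity window.
openssl x509 -in ca.crt -noout -subject -dates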

  

2.   Save the CA key pair as a secret in Kubernetes.


kubectl create secret tls kiam-ca-key-pair \
  --cert=ca.crt \
  --key=ca.key \
  --namespace=cert-manager

  

3.   Deploy cluster issuer and issue the certificate.

     a.    Create the Kiam namespace. 


apiVersion: v1
kind: Namespace
metadata:
  name: kiam
  annotations:
    iam.amazonaws.com/permitted: ".*"
---


     b.   Deploy the cluster issuer and issue the certificate.


apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: kiam-ca-issuer
  namespace: kiam
spec:
  ca:
    secretName: kiam-ca-key-pair
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: kiam-agent
  namespace: kiam
spec:
  secretName: kiam-agent-tls
  issuerRef:
    name: kiam-ca-issuer
    kind: ClusterIssuer
  commonName: kiam
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: kiam-server
  namespace: kiam
spec:
  secretName: kiam-server-tls
  issuerRef:
    name: kiam-ca-issuer
    kind: ClusterIssuer
  commonName: kiam
  dnsNames:
  - kiam-server
  - kiam-server:443
  - localhost
  - localhost:443
  - localhost:9610
---


4.   Test if certificates are issued correctly.


kubectl -n kiam get secret kiam-agent-tls -o yaml
kubectl -n kiam get secret kiam-server-tls -o yaml


Annotating Resources

1.    Add the IAM role’s name to the deployment as an annotation.


apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mydeployment
  namespace: default
spec:
...
  minReadySeconds: 5
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: my-role
    spec:
      containers:
...


2.    Add a role annotation to the namespace in which the pods will run. (You don’t need to do this with Kube2iam.)


apiVersion: v1
kind: Namespace
metadata:
  name: default
  annotations:
    iam.amazonaws.com/permitted: ".*"

  

By default, no roles are allowed. You can use a regex, as shown above, to allow all roles, or specify a particular role per namespace.
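For example, here is a sketch of a namespace annotation that permits only my-role (the annotation value is a regex, so an exact-match pattern is used):

apiVersion: v1
kind: Namespace
metadata:
  name: default
  annotations:
    iam.amazonaws.com/permitted: "^my-role$"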

  

Deploying Kiam Agent and Server

Kiam Server

The manifest below deploys the following:

  1. Kiam Server DaemonSet, which will run on the Kubernetes master nodes (configured to use the TLS secret created above)
  2. Kiam Server service
  3. Service account, ClusterRole and ClusterRoleBinding required by Kiam server


---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: kiam-server
  namespace: kiam
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: kiam-read
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  - pods
  verbs:
  - watch
  - get
  - list
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kiam-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kiam-read
subjects:
- kind: ServiceAccount
  name: kiam-server
  namespace: kiam
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: kiam-write
rules:
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kiam-write
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kiam-write
subjects:
- kind: ServiceAccount
  name: kiam-server
  namespace: kiam
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  namespace: kiam
  name: kiam-server
spec:
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: kiam
        role: server
    spec:
      tolerations:
       - key: node-role.kubernetes.io/master
         effect: NoSchedule
      serviceAccountName: kiam-server
      nodeSelector:
        kubernetes.io/role: master
      volumes:
        - name: ssl-certs
          hostPath:
            path: /etc/ssl/certs
        - name: tls
          secret:
            secretName: kiam-server-tls
      containers:
        - name: kiam
          image: quay.io/uswitch/kiam:b07549acf880e3a064e6679f7147d34738a8b789
          imagePullPolicy: Always
          command:
            - /kiam
          args:
            - server
            - --level=info
            - --bind=0.0.0.0:443
            - --cert=/etc/kiam/tls/tls.crt
            - --key=/etc/kiam/tls/tls.key
            - --ca=/etc/kiam/tls/ca.crt
            - --role-base-arn-autodetect
            - --assume-role-arn=<KIAM_SERVER_ROLE_ARN>
            - --sync=1m
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
          livenessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/tls.crt
              - --key=/etc/kiam/tls/tls.key
              - --ca=/etc/kiam/tls/ca.crt
              - --server-address=localhost:443
              - --gateway-timeout-creation=1s
              - --timeout=5s
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 10
          readinessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/tls.crt
              - --key=/etc/kiam/tls/tls.key
              - --ca=/etc/kiam/tls/ca.crt
              - --server-address=localhost:443
              - --gateway-timeout-creation=1s
              - --timeout=5s
            initialDelaySeconds: 3
            periodSeconds: 10
            timeoutSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: kiam-server
  namespace: kiam
spec:
  clusterIP: None
  selector:
    app: kiam
    role: server
  ports:
  - name: grpclb
    port: 443
    targetPort: 443
    protocol: TCP


Note: 

  1. The scheduler toleration and node selector that we have in place here make sure that the Kiam server pods get scheduled on the Kubernetes master nodes only. This is why we enabled the trust relationship between the kiam-server IAM role and the IAM role attached to the Kubernetes master nodes (above).

  

...
       tolerations:
       - key: node-role.kubernetes.io/master
         effect: NoSchedule 
...

  

...
      nodeSelector:
        kubernetes.io/role: master
...

  

  2. The kiam-server role ARN is provided as an argument to the Kiam server container. Make sure you update the <KIAM_SERVER_ROLE_ARN> field in the manifest above with the ARN of the role you created.
  3. The ClusterRole and ClusterRoleBinding created for the Kiam server grant it the minimal permissions required to operate effectively. Please consider carefully before changing them.
  4. Make sure the path to the SSL certs is set correctly, according to the secret you created from the cert-manager certificates. This is important for establishing secure communication between the Kiam server and Kiam agent pods.
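Before moving on to the agent, it is worth confirming that the server pods are up and scheduled on the master nodes (the label selector matches the manifest above):

# Server pods should be Running, one per master node.
kubectl -n kiam get pods -l role=server -o wide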

  

Kiam Agent

The manifest below deploys the Kiam Agent DaemonSet, which will run on Kubernetes worker nodes only:


apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  namespace: kiam
  name: kiam-agent
spec:
  template:
    metadata:
      labels:
        app: kiam
        role: agent
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      volumes:
        - name: ssl-certs
          hostPath:
            path: /etc/ssl/certs
        - name: tls
          secret:
            secretName: kiam-agent-tls
        - name: xtables
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
      containers:
        - name: kiam
          securityContext:
            capabilities:
              add: ["NET_ADMIN"]
          image: quay.io/uswitch/kiam:b07549acf880e3a064e6679f7147d34738a8b789
          imagePullPolicy: Always
          command:
            - /kiam
          args:
            - agent
            - --iptables
            - --host-interface=cali+
            - --json-log
            - --port=8181
            - --cert=/etc/kiam/tls/tls.crt
            - --key=/etc/kiam/tls/tls.key
            - --ca=/etc/kiam/tls/ca.crt
            - --server-address=kiam-server:443
            - --gateway-timeout-creation=30s
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
            - mountPath: /var/run/xtables.lock
              name: xtables
          livenessProbe:
            httpGet:
              path: /ping
              port: 8181
            initialDelaySeconds: 3
            periodSeconds: 3


It should be noted that the Kiam agent also runs with host networking enabled, similar to Kube2iam. Also, one of the arguments to the Kiam agent’s container is the address of the service fronting the Kiam server, in this case kiam-server:443. Therefore, we should deploy the Kiam server before deploying the Kiam agent.

Also, the container argument --gateway-timeout-creation defines how long the agent waits for the Kiam server pod to be up before trying to connect. It can be tweaked depending on how long pods take to come up in your Kubernetes cluster; a thirty-second waiting period is usually enough.
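Once the agents are running, a minimal sanity check is to hit the agent's health endpoint, which the liveness probe above also uses (run this from a worker node, since the agent uses host networking):

# The agent proxy should respond on its host port.
curl http://localhost:8181/ping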


Testing

The process for testing the Kiam setup is the same as for Kube2iam: deploy a test pod and curl the metadata API to check the assigned role. Please ensure that both the deployment and the namespace are properly annotated.
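As with Kube2iam, the check from inside an annotated test pod looks like this (the expected output assumes the my-role annotation used earlier):

# Should return the role annotated on the pod, e.g. my-role.
curl 169.254.169.254/latest/meta-data/iam/security-credentials/

# Fetch the temporary credentials for that role.
curl 169.254.169.254/latest/meta-data/iam/security-credentials/my-role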

  


IAM Roles for Service Accounts (IRSA)

Recently, AWS released its own mechanism for giving pods access to AWS resources: IAM Roles for Service Accounts (IRSA). Since the role is tied to a service account, it can be shared by all pods to which that service account is attached. This feature is available both in AWS EKS and in kOps-based installations. You can read more about it here.
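As a quick illustration, here is a hedged sketch using eksctl (it assumes an EKS cluster named my-cluster with an IAM OIDC provider already associated; the service account name and policy ARN are examples):

# Create a Kubernetes service account backed by an IAM role in one step.
eksctl create iamserviceaccount \
  --cluster my-cluster \
  --namespace default \
  --name my-service-account \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
  --approve

Pods that use my-service-account then receive credentials for that role through a projected web-identity token, with no extra proxy running in the cluster.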

  

Conclusion

The tools covered in this blog help manage access from Kubernetes pods to AWS resources, and they all have their pros and cons.

While Kube2iam is the easiest to implement, the ease of setup compromises reliability: Kube2iam might not perform reliably under high-load conditions. It is better suited for non-production environments or scenarios that don’t experience major traffic surges.

IRSA requires more work than Kube2iam but, given Amazon’s detailed documentation, it may be easier to implement. Because it is so recent, there was little industry adoption of IRSA at the time this article was written.

Kiam’s implementation needs cert-manager running and, unlike with Kube2iam, you need to annotate the namespace along with the deployment. Regardless, we highly recommend Kiam: it can be used in all cases, provided you have the resources to run cert-manager and your master nodes can handle a DaemonSet running on them. By using the manifests provided in this post, your setup will be seamless and production-ready.

If you want to try visualizing your metrics on Grafana dashboards that are powered by Prometheus, get on to the MetricFire free trial today. Also, feel free to sign up for a demo and talk to us directly about what monitoring solutions work for you.

This article was written by our guest blogger Vaibhav Thakur. If you liked this article, check out his LinkedIn for more.
