In this series, we will guide you through the most crucial container networking concepts. You don't need to be a Docker expert to follow along, but a basic understanding of networking, Docker, and Kubernetes is required. If you prefer, you can skip ahead to Docker Networking Part II.
Docker is a tool designed to build and run applications in isolated, lightweight environments called containers. To master Docker, you need a good understanding of how to build images, run them, secure your containers, work with the Docker filesystem, and manage Docker networks.
Docker networking may be the most confusing part of your learning journey. Over the last few years, a whole dynamic ecosystem has grown around this technology. Tools like Docker Compose, Docker Swarm, and Kubernetes solved many problems in the containerization ecosystem but introduced new challenges, notably in networking. A good understanding of the Docker ecosystem requires a good knowledge of networking.
When we containerize an application, say WordPress, we can build a single image that ships a web server (Nginx or Apache), PHP (or PHP-FPM), and a MySQL/MariaDB database together.
At first glance, this solution eliminates many networking problems, since every component lives in the same container. In this case, you can use a process manager like supervisord to keep all of your processes running.
However, this is not a good practice, since it adds more layers to your images: to use supervisord, you need to install it and ship its configuration with the container. Good practice is to build and run lightweight containers that include only the essential processes and software packages.
Moreover, running multiple processes in a single container is an anti-pattern. A good practice is to run a single process per container.
For the WordPress case, you should have a container for the web server (Apache or Nginx), a container for PHP, and another container for the database. These containers must communicate with each other: the web server receives a request and forwards it to the PHP container, and if the latter needs data, it requests it from the database container. The return path must be taken into account as well. If you run these containers on different hosts, they should still be able to send traffic to and receive traffic from each other. Traffic between multiple hosts should also meet a minimum set of security standards. You may also face cases where you need to scale the web server or the application containers and route traffic to them using a load balancer.
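To make the request path above more concrete, here is a minimal sketch in Python. It is not Docker code: three tiny TCP services on the loopback interface stand in for the web server, PHP, and database containers, and each one only talks to the tier behind it. The helper names (`ask`, `start_service`, `fetch`), the one-line text protocol, and the sample data are all assumptions made for illustration.

```python
import socket
import socketserver
import threading


def ask(port: int, message: str) -> str:
    """Send one line to a service on localhost and return its one-line reply."""
    with socket.create_connection(("127.0.0.1", port)) as conn:
        conn.sendall(message.encode("utf-8") + b"\n")
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


def start_service(respond) -> int:
    """Run a one-request-per-connection TCP service in a background thread.

    `respond` maps the request line to the reply line. The operating system
    picks a free port (port 0), which is returned to the caller.
    """

    class Handler(socketserver.StreamRequestHandler):
        def handle(self):
            request = self.rfile.readline().decode("utf-8").strip()
            self.wfile.write(respond(request).encode("utf-8") + b"\n")

    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), Handler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.server_address[1]


# Stand-in for the database container: looks up a post by its id.
POSTS = {"1": "Hello world"}  # hypothetical sample data
db_port = start_service(lambda post_id: POSTS.get(post_id, "not found"))

# Stand-in for the PHP container: asks the database, then renders the result.
app_port = start_service(lambda post_id: "<h1>" + ask(db_port, post_id) + "</h1>")

# Stand-in for the web server container: forwards the last path segment to PHP.
web_port = start_service(lambda path: ask(app_port, path.rsplit("/", 1)[-1]))


def fetch(path: str) -> str:
    """Act as the client: the request goes web -> app -> db and back again."""
    return ask(web_port, path)


print(fetch("/posts/1"))  # prints "<h1>Hello world</h1>"
```

Every hop in this sketch is a network connection, even though everything runs on one machine. With real containers the same three hops exist, but each service has to be reachable by an address the others can resolve, possibly across hosts, which is exactly where container networking comes in.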
Even if this use case seems basic, we can see how networking plays an essential role in running the whole stack.
You may run standalone containers, but in highly available environments, particularly in production, you will need an orchestration platform to manage these containers.
Orchestration systems like Kubernetes solve many problems that containers alone cannot solve satisfactorily.
Let's identify some of them:
If you look at most of these use cases and features, you will find that networking is a common point.
Orchestration is a must, but it adds an extra layer of networking. Besides inter-container networking on the same node, Kubernetes has multiple types of networking: master-to-cluster, cluster-to-master, Internet-to-service, service-to-pod, pod-to-pod, container-to-container, and so on. If we want to go into more detail, we can also consider node, kubelet, kube-proxy, and DNS networking.
Kubernetes is often considered the operating system of the modern data center, so one would expect to see this level of networking complexity in a platform of this magnitude. At the same time, networking is one of the most complex, and probably the most critical, parts of Kubernetes. There is no mastery of Kubernetes without embracing its networking system.
There is an aphorism in the Zen of Python that says:
"Simple is better than complex. Complex is better than complicated."
In IT, complexity refers to the number of components in a system and the level of interaction between them. "Complicated", on the other hand, refers to a high level of difficulty.
Container networking, notably in orchestrated systems, is complex but not complicated. Moreover, this complexity is sometimes necessary to create abstract systems and generic solutions to common problems. This is what Joe Beda, one of the Kubernetes developers, said:
"Kubernetes is a complex system. It does a lot and brings new abstractions [...] as engineers, we tend to discount the complexity we build ourselves vs. complexity we need to learn."
Beda went on to say that when you create a complex deployment system with Jenkins, Bash, Puppet/Chef/Salt/Ansible, AWS, Terraform, etc., you end up with a unique brand of complexity that you are comfortable with. It grew organically, so it doesn't feel complex, but bringing new people on to help with an organically grown system like this is difficult. They may know some of the tools, but the way you have put them together is unique.
This is a place where Kubernetes adds value. It provides a set of abstractions that solve a common set of problems. As people build understanding and skills around those problems, they are more productive in more situations.
Although container networking looks complicated at first sight, it is approachable if you have basic networking knowledge and time to invest in learning new skills.
In Part II, we are going to dive deep into the technical details. We will see how Docker container networking works in standalone mode, how multi-container networking works, and how the two differ. We will also discover how networking is managed across multiple hosts, and explore the fascinating world of Kubernetes networking.