I will try to give you an overview of what I learned in my first days working with Kubernetes. I did not originally have a plan beyond migrating our application to Kubernetes as fast as possible.
I ended up spending my week as follows:
Although this list seems straightforward, in reality it was more like a roller coaster ride.
The more I learned about Kubernetes, the less it appeared to be a single software application; I came to understand Kubernetes as a collection of software contracts and well-designed interfaces.
These interfaces try to solve distributed computing problems, e.g. scaling, networking, storage, deployment, and operating highly available applications at scale.
kind is a tool for running local Kubernetes clusters using Docker container “nodes”. kind was primarily designed for testing Kubernetes itself, but may be used for local development or CI.
As written on the kind website, kind uses Docker to create all necessary components to start a Kubernetes cluster. In contrast to minikube, another Kubernetes SIGs project, no additional VM is created and the host kernel is used with as little overhead as possible.
"SIG" stands for "Special interest groups" and are subprojects of Kubernetes which helped the kubernetes team to further scale out the core project by giving more responsibility to individual maintainer groups.
Following the tutorials on the kind website led to quick success. kind creates a Docker container that functions as master and worker node at the same time: a one-node cluster. The Kubernetes management port 6443 is exposed to the host network, and kubectl can be set up easily by following the documentation.
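A minimal sketch of that first step (the cluster name "kind" is the tool's default, so the kubectl context is named `kind-kind`):

```shell
# create a one-node kind cluster inside a Docker container
kind create cluster

# verify that kubectl can reach it via the auto-generated context
kubectl cluster-info --context kind-kind
kubectl get nodes
```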
The Kubernetes dashboard installation is straightforward as well and gives you a graphical representation of your cluster.
So I could see my cluster in the dashboard, and some "Hello World" HTTP endpoints already worked: nice!
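For illustration, a "Hello World" endpoint of this kind can be stood up with two kubectl commands; the deployment name and image here are my own choices, not the ones from the original setup:

```shell
# run a small demo web server and expose it on a NodePort
kubectl create deployment hello --image=nginxdemos/hello
kubectl expose deployment hello --port=80 --type=NodePort

# shows which high port on the node maps to port 80 of the service
kubectl get service hello
```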
On day two I realized that there are still two big features missing in my local Kubernetes cluster: Persistent storage and traffic routing.
Storage for distributed systems can be endlessly complex. Kubernetes provides APIs for multiple solutions, starting with simple host-mounted volumes and ending with support for cloud providers' native storage solutions like awsElasticBlockStore, azureDisk or gcePersistentDisk.
kind provides examples of using host-mounted volumes for persistence, which work similarly to Docker volumes. They worked flawlessly and could theoretically be shared between our local Docker and kind setups.
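A sketch of such a mount in a kind cluster config (the host and container paths are placeholder choices of mine):

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /tmp/kind-data    # directory on the host machine
    containerPath: /data        # path inside the kind "node" container
```

Passed via `kind create cluster --config kind-config.yaml`, the directory then behaves much like a Docker volume and can back a hostPath PersistentVolume.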
After solving the persistence problem on the local cluster, I started to look into networking. Kubernetes runs its own internal DNS server, and applications can call other applications by their configured service name. On my kind setup with only one node this worked perfectly fine: pods could reach the Kubernetes DNS server, which then resolved the service name to another Docker container.
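This resolution can be checked from inside any pod; the pod and service names below are assumed examples, not ones from the original cluster:

```shell
# cluster DNS resolves <service>.<namespace>.svc.cluster.local
# to the service's cluster IP
kubectl exec -it some-pod -- nslookup hello-service.default.svc.cluster.local
```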
I didn't foresee back then that networking would get incredibly more complex once the cluster was distributed over multiple nodes!
With our basic application running in Kubernetes, I started to set up three bare-metal servers running CentOS the next day.
Kubernetes seems to evolve at a rapid pace. Many tutorials on the internet use outdated Kubernetes YAML API versions or require opening different ports on firewalld. One of the better and simpler tutorials I found that automates the Kubernetes master and worker node setup is the following GitHub repository: ctienshi/kubernetes-ansible. I will create a pull request for the broken network setup with missing firewalld rules and missing route mapping into the containers.
CentOS 7 and CentOS 8 are solid choices for running your bare-metal setup. I would pick the version depending on the hardware in use and the drivers it needs.
By using Ansible to set up your servers, you can easily add new servers and install missing packages on all of them with a single command. Especially if you manage multiple servers across multiple networks, you can ensure that all servers are up to date and that a newly added server is set up exactly the same way as your current ones.
After initializing Kubernetes with kubeadm, you can easily join the worker nodes to the master with the join token printed at the end of the master setup.
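The init/join flow looks roughly like this; `<master-ip>`, `<token>` and `<hash>` are placeholders for the values kubeadm prints:

```shell
# on the master node; this pod network CIDR matches what flannel expects
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# on each worker node, using the token and CA hash printed by kubeadm init
sudo kubeadm join <master-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>
```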
Before connecting to your new cluster, you need to point your local kubectl CLI at it by switching the context to the new master server.
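A hedged sketch of that switch; the host name and file paths are assumptions, but `kubernetes-admin@kubernetes` is the context name a default kubeadm setup generates:

```shell
# copy the admin kubeconfig from the master and activate it locally
scp root@master:/etc/kubernetes/admin.conf ~/.kube/baremetal.conf
export KUBECONFIG=~/.kube/baremetal.conf

# switch to the new cluster's context and verify the nodes are visible
kubectl config use-context kubernetes-admin@kubernetes
kubectl get nodes
```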
In the future I may look into kubernetes-sigs/kubespray, another Kubernetes SIGs project, or Rancher, which further automate the deployment of Kubernetes clusters.
As I previously mentioned, networking is rather straightforward if your Kubernetes cluster has only one kind node: you have only one interface, and all routes are available by default.
If you think about a multi-node cluster, you probably have a setup like the following:
| Server   | Server IP    | Pod   | Pod IP   |
|----------|--------------|-------|----------|
| worker-a | 192.168.2.10 | pod-b | 10.0.2.2 |
| worker-b | 192.168.2.11 | pod-c | 10.0.3.1 |
So how can pod-b with the IP address 10.0.2.2 on worker-a communicate with pod-c with the IP 10.0.3.1 on worker-b?
Multiple plugins of the containernetworking/cni project have already solved this problem. CNI is another interface, used across various container runtimes, that describes how containers can communicate across different machines.
I used coreos/flannel; the default kube-proxy setup might also work for you. If you run Kubernetes at a cloud provider, the provider already has a CNI implementation that plugs right into your Kubernetes cluster and helps each container reach any other container.
If you use flannel, you need to initialize your Kubernetes cluster with the pod network CIDR 10.244.0.0/16.
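After the cluster is initialized with that CIDR, flannel itself is installed with a single manifest. The URL below is the one the flannel project documented at the time; verify it against the repository, as it has moved between releases:

```shell
# deploy flannel as a DaemonSet across all nodes
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
```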
Now that every container can communicate with every other container via its service name, which it can correctly resolve and reach, we still face the problem of how to expose these services to the outside world!
Again, a cloud service provider would have you covered with a load balancer that your services can connect to, which does SSL offloading and distributes the traffic across the deployments of the service.
But how does load balancing in real bare metal environments work?
MetalLB has you covered! Previously we said that each server has an IP address in the range 192.168.2.10 to 192.168.2.20, and your default gateway will probably be located at 192.168.2.1. Each server in the /24 network sends out an ARP request to determine the MAC address of a target IP, so a switch can forward your traffic on layer 2 to the correct destination.
We can now configure MetalLB with virtual IP addresses in our server subnet, for example 192.168.2.22. MetalLB will then answer ARP requests for that address from the node currently running the service, e.g. worker-a. If worker-a goes offline, Kubernetes moves all its pods to other available worker nodes, and on the next ARP request MetalLB answers for 192.168.2.22 from worker-b. So we can always reach our cluster via our MetalLB IP address!
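A sketch of such a layer-2 address pool, using the ConfigMap format that MetalLB used up to v0.12 (newer releases configure this via CRDs instead); the address range is an example matching the subnet above:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2            # announce IPs via ARP, as described above
      addresses:
      - 192.168.2.22-192.168.2.30 # virtual IPs MetalLB may hand out
```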
In combination with ingress-nginx and cert-manager, traffic distribution and SSL offloading can be realized.
Don't be confused by the multiple ingress controller implementations; Ingress itself is also only an interface that describes how load balancers should route traffic across your Kubernetes network.
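A minimal Ingress resource looks like this; the host and service names are assumed examples, and the `networking.k8s.io/v1` API version shown here has changed across Kubernetes releases, so check it against your cluster:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-ingress
spec:
  rules:
  - host: hello.example.com        # requests for this host...
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: hello-service    # ...are routed to this service
            port:
              number: 80
```

Whichever controller you install (ingress-nginx, for example) reads this same resource and does the actual routing.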
That's your part! If you need any help or hands-on support, you can contact me via Twitter, LinkedIn or e-mail. You can find more information about me on my front page.
I hope you can now understand why, for me, Kubernetes is less an application and more a collection of cloud computing interfaces that provide efficiency and (in theory) easy migration between cloud service providers. It also creates a common understanding of today's cloud computing problems and guides you toward possible solutions, helping you build a higher-quality internet!
If you have any questions, just hit me up on Twitter:
What I have learned in 5 days working with Kubernetes - no code just thoughts: #Kubernetes #Docker #kind #ingress #MetalLB #Flannel #firewalld #CloudComputing #networking
— Philip Miglinci (@pmigat)