For the better part of two weeks now, I’ve been learning Kubernetes with the goal of converting my previously docker-compose-based stack over to it. Prior to the website move, I made sure to learn some of the foundational and surrounding technologies to at least an adept level:

  • Docker
  • GitLab CI/CD, runner usage, etc.
  • DigitalOcean Kubernetes service

At the end of the day, here were my big takeaways as they applied to my site:

  • Kubernetes’ entire running configuration is controlled via its API. The API contains API objects, which may be modified by hand or automatically by some sort of process or controller.
  • You can use the kubectl command-line program to interact with the Kubernetes API
  • pods: The foundational Kubernetes building block is called a pod
  • Pods are kind of like containers, in that each has a single IP address, but a pod may be composed of multiple containers
  • For example, take an nginx + PHP pod. It’s composed of two containers, php and nginx. nginx listens on the pod’s :80 port, and PHP-FPM listens on the pod’s :9000 port. The pod could share a /code directory between its containers for common data, though my setup does not put nginx and PHP together like this.
  • deployments: Pods are put into place with a deployment. Deployments specify what’s in the pod, and how many replicas of the pod there should be
  • services: Another abstraction, this one to help with routing. They point to pods using selectors
  • Selectors match against labels, which are attached to pods via the deployment’s pod template. Take a deployment nginx whose pods carry the labels {app: nginx, tier: frontend}. An nginx service could declare that it load balances to all pods with the labels {app: nginx, tier: frontend}. This helps achieve high availability for a deployment
  • Pods that are unhealthy are destroyed and recreated automatically, as dictated by the deployment. They are dynamically removed and re-added to the service for load balancing as needed
  • ingresses: API objects that help manage external access to services in the cluster
  • Ingresses help with load balancing, SSL termination, and name- and path-based virtual hosting. For example: direct HTTP requests for develop.worobetz.ca to the service nginx-develop in namespace worobetz-website-3789083.
  • namespaces: These help organize multiple projects into their own little groupings, keeping apps separate.
  • nodes: Kubernetes clusters are composed of nodes. Nodes are simply servers, either physical or virtualized. You don’t need to think about them too much other than capacity planning.
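To make the labels-and-selectors relationship concrete, here’s a minimal sketch of a deployment plus a matching service. The names, labels, and image are illustrative, not my actual manifests:

```yaml
# Hypothetical deployment: two nginx replicas, with pods labelled
# {app: nginx, tier: frontend} via the pod template.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
      tier: frontend
  template:
    metadata:
      labels:
        app: nginx
        tier: frontend
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          ports:
            - containerPort: 80
---
# Matching service: load balances to any healthy pod carrying these labels,
# no matter which node that pod lands on.
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
    tier: frontend
  ports:
    - port: 80
      targetPort: 80
```

Apply both with kubectl apply -f and the service will track the deployment’s pods as they are created and destroyed.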

Whew, that was a lot for just the foundation! And there’s so much more to it.

It’s always been my opinion that the best way to learn new things is to start using them. To that end, if you’re looking to dive into Kubernetes, the best way is to choose a simple project (e.g. “I want to host my static website in Kubernetes”) and muddle through it until you’re done. Then, go back and make improvements. I “completed” my project multiple times, only to go back and redo it to make minor improvements, learning from my mistakes as I went. It’s a great way to learn, and it’s how I learned Kubernetes.

So before we get into the nitty-gritty details, let’s start with an overview. The diagram below outlines how HTTP (and a little DNS) traffic gets routed through my Kubernetes infrastructure to a single website (worobetz.ca).

```mermaid
graph TD
  CLIENT[Website visitor]
  DNSHosts[*.worobetz.ca, worobetz.ca, *.blog.worobetz.ca]
  LB[DigitalOcean Load Balancer]
  Node1[Node #1]
  Node2[Node #2]
  IngressService[Shared Ingress Service]
  Ingress1[Ingress Pod #1]
  Ingress2[Ingress Pod #2]
  NginxService[worobetz.ca nginx Service]
  NginxPod1[Nginx Pod #1]
  NginxPod2[Nginx Pod #2]
  PHPService[worobetz.ca PHP Service]
  PHPPod1[PHP Pod #1]
  PHPPod2[PHP Pod #2]
  CLIENT -->|#1: DNS Lookup| DNSHosts
  CLIENT -->|#2: HTTP request| LB
  LB --> Node1
  LB --> Node2
  Node1 --> IngressService
  Node2 --> IngressService
  IngressService --> Ingress1
  IngressService --> Ingress2
  Ingress1 -->|HOST: worobetz.ca| NginxService
  Ingress2 -->|HOST: worobetz.ca| NginxService
  NginxService --> NginxPod1
  NginxService --> NginxPod2
  NginxPod1 -->|php-master:9000| PHPService
  NginxPod2 -->|php-master:9000| PHPService
  PHPService --> PHPPod1
  PHPService --> PHPPod2
```

Let’s start with the ingress. I deployed the initial ingress setup to my Kubernetes cluster by installing it from GitLab’s UI. Just by doing that, Kubernetes knows it needs to create the listed DigitalOcean Load Balancer with its own public IP, if it doesn’t already exist. I could go into my DigitalOcean control panel and modify this Load Balancer if needed, but that would probably be a bad idea, as it would cause configuration drift. The Load Balancer balances traffic to the two Kubernetes nodes. The nodes route the traffic to the ingress service, which in turn does round-robin load balancing across ingress pods located on separate nodes. Once an ingress pod receives a user’s HTTP request, it looks at the headers. It routes traffic to the appropriate service using the request’s Host field, which might look something like develop.blog.worobetz.ca or worobetz.ca. In the case of the ingress configuration for worobetz.ca, it routes requests with Host: worobetz.ca to the service named nginx-master. It knows how to do this from the rule defined in my repository’s .ci-cd/k8s/ingress.yml file.
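For illustration, a host-based rule like the one described above might look roughly like this. This is a sketch, not my actual ingress.yml; only the hostname and the nginx-master service name come from the setup described here, and the apiVersion and metadata are assumptions:

```yaml
# Hypothetical Ingress rule: send requests with Host: worobetz.ca
# to the nginx-master service on port 80.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: worobetz-ca
spec:
  rules:
    - host: worobetz.ca
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx-master
                port:
                  number: 80
```

Each additional hostname (develop.worobetz.ca, blog.worobetz.ca, and so on) would get its own rule pointing at its own service.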

NOTE: There’s more in-depth information on cluster-internal networking here, but essentially all the nodes in the cluster are part of a private network, enabling networking without concern for which node a service or pod is located on. It’s all the same network.

Now the HTTP request flows to the nginx service’s internal cluster IP. Again, the service does round-robin load balancing to one of its pods, in this case nginx Pod #1 or nginx Pod #2. This might be the end of an HTTP request’s journey, if my website didn’t have a PHP component. But it does! So what now? Well, the Kubernetes way is microservices, and I figured PHP can be just as much of a microservice as nginx can. This way, nginx and PHP can scale independently if need be. nginx routes requests to PHP-FPM via a normal host:port syntax, so that makes it easy. Let’s take a look at our nginx default.conf, as described in a Kubernetes ConfigMap defined in nginx_configMap.yml. Notice the directive fastcgi_pass php-<CI_BUILD_REF_NAME>:9000. <CI_BUILD_REF_NAME> gets replaced with the branch name on deployment (using sed in .gitlab-ci.yml). nginx will now forward PHP requests to the PHP service, which uses round-robin load balancing to distribute traffic to the PHP pods. Huzzah: highly available, separately scalable PHP!
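A sketch of what such a ConfigMap could look like. Only the fastcgi_pass line and the <CI_BUILD_REF_NAME> placeholder come from my setup; the ConfigMap name, server block, and paths are illustrative assumptions:

```yaml
# Hypothetical ConfigMap carrying nginx's default.conf, mounted into
# the nginx pods. The fastcgi_pass target is the PHP *service* name,
# so Kubernetes handles the load balancing to PHP pods.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
data:
  default.conf: |
    server {
      listen 80;
      root /code;
      index index.php;
      location ~ \.php$ {
        # <CI_BUILD_REF_NAME> is substituted with the branch name at
        # deploy time, e.g. producing php-master:9000 on master.
        fastcgi_pass php-<CI_BUILD_REF_NAME>:9000;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
      }
    }
```

The substitution step in .gitlab-ci.yml can be as simple as a sed one-liner that replaces the placeholder with the branch-name variable before running kubectl apply.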