In the previous blog, we learned about the basic Kubernetes components. If you've skipped that one, you might have a hard time understanding this blog, so I recommend reading the previous blog here. As I mentioned, I'm creating a series of Kubernetes articles so that you can easily access all the resources in one place.
Let's quickly recap what we learned in the previous blog. You can skip this part if you want; it's just an overview.
We learned that pods are an abstraction over containers: instead of communicating with containers directly, we interact with the Kubernetes layer. Since pods are ephemeral and can die, communication between pods can break, which is why we have services. Services provide a stable endpoint for pods. To interact with the cluster from outside, we have ingress, which provides HTTP or HTTPS routes into the cluster. To store configuration and sensitive data, such as database URLs and passwords, we have ConfigMaps and Secrets. Data persistence is always an issue with containers, which is why we have volumes; think of volumes as external hard drives. We create deployments to automate pod scheduling and healing. Since deployments are meant for stateless apps, we use StatefulSets for stateful apps like databases.
Now let's look at the architecture of Kubernetes, which will make all of this clearer. Kubernetes has two very important parts: the control plane and the worker nodes. Let's talk about both in this blog.
Control Plane ☸️
The control plane in Kubernetes manages the cluster and ensures it runs smoothly. It was previously referred to as the "master node," an older term for the node that runs the control plane components. These components typically run on one or more dedicated nodes, separate from the worker nodes they manage.
API Server 🚀
The Kubernetes API server is a crucial component in the Kubernetes control plane. Its main function is to expose the Kubernetes API, allowing users to interact with the cluster and manage applications running within it. Additionally, the API server handles authentication and authorization of API requests and stores objects in etcd.
The API server runs within the cluster as a pod and exposes a RESTful API over HTTPS (port 6443 by default, though 443 is also common in managed clusters). This endpoint serves as the gateway through which users and other system components interact with the cluster. Through the API server, users can create, read, update, and delete Kubernetes objects such as pods, deployments, and services. It also provides information about nodes, pods, and other cluster resources.
Furthermore, the API server plays a vital role in validating and configuring all API objects in the cluster. For example, when a user creates a pod, the API server checks the pod's specification for validity, verifies the availability of requested resources, and ensures the referenced service account exists. If the pod passes these checks, the API server stores it in etcd, after which the scheduler assigns it to a node.
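To get a feel for what "interacting with the API server" looks like in practice, here is a minimal sketch using the official Kubernetes Python client (kubectl does essentially the same thing under the hood). The namespace, pod name, and image are just placeholders:

```python
# pip install kubernetes
from kubernetes import client, config

# Load credentials from ~/.kube/config, the same file kubectl uses.
config.load_kube_config()
v1 = client.CoreV1Api()

# READ: list pods in the "default" namespace via the API server.
for pod in v1.list_namespaced_pod(namespace="default").items:
    print(pod.metadata.name, pod.status.phase)

# CREATE: ask the API server to validate and store a new Pod object.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="hello-pod"),
    spec=client.V1PodSpec(
        containers=[client.V1Container(name="hello", image="nginx:1.25")]
    ),
)
v1.create_namespaced_pod(namespace="default", body=pod)

# DELETE: remove the object; controllers and the kubelet react accordingly.
v1.delete_namespaced_pod(name="hello-pod", namespace="default")
```

Every one of these calls is just an HTTPS request to the API server endpoint; the client library simply wraps the REST API described above.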
Scheduler 🚀
Kubernetes includes a built-in scheduler that automatically places new Pods onto Nodes. The scheduler ensures that the workload is distributed evenly across the cluster and that each Pod's resource requirements are met. It first filters out nodes that do not meet a Pod's resource requirements or constraints, leaving a set of "feasible nodes" where the Pod can run. It then assigns each feasible node a score based on how suitable it is for running the Pod, and the node with the highest score is selected.
However, in some cases, you may want to manually specify where a Pod should run. Kubernetes allows you to do this through manual scheduling, by setting the Pod spec's nodeName field, as in the sketch below.
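Here is a minimal sketch of manual scheduling with the Python client. Setting nodeName bypasses the scheduler entirely; the node name "worker-1" is a placeholder for a real node in your cluster:

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="pinned-pod"),
    spec=client.V1PodSpec(
        # Setting nodeName skips the scheduler: the kubelet on this
        # node picks the Pod up directly. "worker-1" is a placeholder.
        node_name="worker-1",
        containers=[client.V1Container(name="app", image="nginx:1.25")],
    ),
)
v1.create_namespaced_pod(namespace="default", body=pod)
```

Note that a manually scheduled Pod gets none of the scheduler's filtering or scoring, so it will fail to start if the named node lacks the resources it needs.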
etcd 🚀
Etcd serves as a distributed key-value store that Kubernetes relies on to store its cluster data. It functions as the central repository for the Kubernetes control plane, enabling Kubernetes to keep track of all the applications and workloads within the cluster.
This includes essential information such as the cluster's composition (nodes, roles, and statuses of nodes) and the data for ConfigMaps and Secret objects. Essentially, etcd acts as the "brain" of Kubernetes, ensuring that all cluster-related data flows through it, providing a single source of truth for various Kubernetes components.
Etcd uses the Raft consensus algorithm to maintain a highly available, replicated datastore. As long as a majority (a quorum) of etcd nodes remain operational, etcd can continue to provide its services.
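To make this concrete: Kubernetes stores its objects under keys prefixed with /registry. The sketch below lists those keys using the python-etcd3 library, assuming an etcd endpoint reachable on localhost without TLS; a real cluster's etcd normally requires client certificates, and the stored values are protobuf-encoded rather than plain text:

```python
# pip install etcd3
import etcd3

# Assumes etcd is reachable on localhost without TLS; real clusters
# typically require client certificates (ca_cert/cert_cert/cert_key).
etcd = etcd3.client(host="127.0.0.1", port=2379)

# Kubernetes keeps pods under /registry/pods/<namespace>/<name>.
for value, meta in etcd.get_prefix("/registry/pods/default/"):
    # Values are protobuf-encoded Kubernetes objects, so we just
    # print the key names here rather than decoding the payload.
    print(meta.key.decode())
```

You should never write to these keys directly; all changes are supposed to go through the API server, which is the only component that talks to etcd.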
Controller Manager 🚀
Controllers are control loops that watch the state of your Kubernetes cluster and make changes to move the current state towards the desired state.
For example, let's say you create a Deployment that specifies you want 3 replicas of a Pod. The Deployment Controller will:
Compare the actual state (the number of existing Pods) with the desired state (the 3 replicas specified in the Deployment).
If there are fewer than 3 Pods running, it will create more Pods to reach the desired replica count.
If some Pods die or become unresponsive, it will create new Pods to replace them and maintain 3 running Pods.
It will continuously monitor the Pods and take actions to ensure the desired state of 3 replicas is always maintained.
This is the basic working of a controller: it watches the cluster state and takes actions through the API server to move the current state toward the desired state.
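A toy version of that loop, written with the Python client, might look like the sketch below. It is a drastic simplification of the real Deployment controller (which reacts to watch events instead of polling, and manages ReplicaSets rather than Pods directly), but it shows the compare-and-act pattern. The app=web label, image, and replica count are placeholders:

```python
import time
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

DESIRED_REPLICAS = 3
NAMESPACE = "default"
SELECTOR = "app=web"  # placeholder label identifying "our" Pods

def make_pod():
    return client.V1Pod(
        metadata=client.V1ObjectMeta(generate_name="web-", labels={"app": "web"}),
        spec=client.V1PodSpec(
            containers=[client.V1Container(name="web", image="nginx:1.25")]
        ),
    )

while True:
    # Observe the actual state through the API server.
    pods = v1.list_namespaced_pod(NAMESPACE, label_selector=SELECTOR).items
    alive = [p for p in pods if p.status.phase in ("Running", "Pending")]

    # Reconcile: create Pods if we have too few, delete if too many.
    for _ in range(DESIRED_REPLICAS - len(alive)):
        v1.create_namespaced_pod(NAMESPACE, body=make_pod())
    for pod in alive[DESIRED_REPLICAS:]:
        v1.delete_namespaced_pod(pod.metadata.name, NAMESPACE)

    time.sleep(5)  # the real controller reacts to watch events instead
```

Notice that the loop never talks to nodes or containers directly; everything goes through the API server, exactly as described above.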
cloud-controller-manager 🚀
The Cloud Controller Manager is the part of Kubernetes that separates cloud-provider-specific logic from the core control plane components. It oversees the life cycle of the cloud provider resources used by Kubernetes.
Some of its tasks include:
Managing cloud resources such as load balancers and storage volumes, which involves creating, updating, and deleting these resources.
Connecting with the cloud provider's API to gather information about nodes, enabling Kubernetes to stay informed about the cluster's nodes.
Assigning cloud provider-specific identifiers like volume IDs and security group IDs to Kubernetes resources.
Handling updates regarding nodes from the cloud provider, such as node creation, deletion, or changes in metadata. The Cloud Controller Manager informs Kubernetes of these changes.
So in simple terms, the Cloud Controller Manager handles all the communication between Kubernetes and the underlying cloud provider. This separates cloud-specific logic from the main Kubernetes components and makes Kubernetes cloud agnostic.
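One place you can see the Cloud Controller Manager at work is a Service of type LoadBalancer. A sketch with the Python client: creating the object below on a cloud-hosted cluster prompts the CCM to provision an actual cloud load balancer and write its address back into the Service's status (the app=web selector and ports are placeholders):

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="web-lb"),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",      # fulfilled by the cloud controller manager
        selector={"app": "web"},  # placeholder: Pods this LB should target
        ports=[client.V1ServicePort(port=80, target_port=8080)],
    ),
)
v1.create_namespaced_service(namespace="default", body=service)
# On a cloud provider, the CCM eventually fills in
# service.status.load_balancer.ingress with the external address.
```

On a bare-metal cluster with no CCM, the same Service would simply stay in a pending state, which is a nice illustration of where Kubernetes ends and the cloud provider begins.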
Worker Node ☸️
A worker node in Kubernetes is a virtual machine (VM) or physical server that runs the kubelet, kube-proxy, and a container runtime to host Kubernetes pods and containers. Worker nodes are where the actual work happens: they are responsible for executing and running containerized applications.
Container Runtime 🚀
The container runtime in a Kubernetes worker node is responsible for executing container images and managing the lifecycle of containers.
The job of a container runtime in Kubernetes:
Pull container images from a registry and store them locally.
Create and run containers based on the pulled images.
Monitor the running containers and restart them if needed.
Kill containers and clean up resources when they are no longer needed.
When a Pod is scheduled to a worker node, the container runtime on that node pulls the required container images and creates and runs the containers defined in the Pod spec. The runtime then manages the entire lifecycle of those containers.
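Kubernetes talks to the runtime through the Container Runtime Interface (CRI) rather than any single tool, but the Docker SDK for Python gives a rough feel for the same lifecycle steps (pull, create and run, inspect, stop, remove). The image and container names here are placeholders:

```python
# pip install docker
import docker

runtime = docker.from_env()

# 1. Pull the image from a registry and cache it locally.
runtime.images.pull("nginx", tag="1.25")

# 2. Create and start a container from that image.
container = runtime.containers.run("nginx:1.25", name="demo", detach=True)

# 3. Inspect the running container (a runtime also monitors state).
container.reload()
print(container.name, container.status)

# 4. Stop and clean up when the container is no longer needed.
container.stop()
container.remove()
```

On a real node, it is the kubelet that drives these steps through the CRI; you never invoke the runtime by hand.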
Kubelet 🚀
Kubelet acts as an agent on every node within a Kubernetes cluster. Its main role is to ensure that the containers specified in the PodSpecs it receives are running and healthy. To do this, it pulls the required container images, creates and manages containers through the container runtime, monitors their resource usage, and restarts them if they fail. It also communicates with the Kubernetes API server to register itself as a node, receive assigned PodSpecs, provide node health updates, and report container statuses. In addition, the kubelet exposes an HTTP API on the node that the control plane uses for operations such as fetching logs and executing commands in containers (this is what powers kubectl logs and kubectl exec).
In simple terms, kubelet is the node agent that manages the Pods and containers running on a machine in the Kubernetes cluster. It ensures that the containers are running and healthy based on the PodSpecs it receives.
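You can observe the kubelet's reporting indirectly by reading node status from the API server; the Ready condition printed below is what each node's kubelet reports via its heartbeats. A minimal sketch with the Python client:

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Each node's status (including the Ready condition) is reported
# to the API server by the kubelet running on that node.
for node in v1.list_node().items:
    ready = next(c for c in node.status.conditions if c.type == "Ready")
    print(f"{node.metadata.name}: Ready={ready.status}")
```

If a kubelet stops sending heartbeats, the Ready condition eventually flips to Unknown, and controllers start moving that node's Pods elsewhere.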
Kubeproxy 🚀
Kube-proxy serves as a network proxy on every node within the Kubernetes cluster, fulfilling several key functions:
Service Discovery and Load Balancing: Kube-proxy directs traffic from Services to Pods using various load balancing techniques. It monitors the Kubernetes API for newly created Services and Endpoints, subsequently configuring network rules accordingly.
Service Port Abstraction: Kube-proxy intercepts connections destined for Services and reroutes them to the appropriate Pods. This facilitates the use of virtual IPs and ports by Services, rather than exposing the Pods directly.
Protocol Independence: Kube-proxy provides support for both TCP and UDP Services.
In summary, kube-proxy plays a crucial role in enabling Kubernetes Services, effectively functioning as a straightforward network-level load balancer. It intercepts requests aimed at Services and forwards them to the respective backend Pods, offering the advantage of treating Pods as replaceable, ephemeral resources.
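To see the mapping kube-proxy implements, compare a Service's virtual IP with the Pod addresses behind it. The sketch below reads both from the API server, using the "kubernetes" Service in the default namespace because it exists in every cluster:

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# The Service's clusterIP is the virtual IP that kube-proxy serves.
svc = v1.read_namespaced_service("kubernetes", "default")
print("virtual IP:", svc.spec.cluster_ip)

# The Endpoints object lists the real backend addresses that
# kube-proxy forwards traffic to.
eps = v1.read_namespaced_endpoints("kubernetes", "default")
for subset in eps.subsets:
    for addr in subset.addresses:
        print("backend:", addr.ip)
```

The virtual IP never corresponds to any single machine; kube-proxy's network rules on each node translate connections to it into connections to one of the backend addresses.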
I hope you learned something from this blog. If you have, don't forget to drop a like, follow me on Hashnode, and subscribe to my Hashnode newsletter so that you don't miss any future posts. You can also reach out to me on Twitter or LinkedIn. If you have any questions or feedback, feel free to leave a comment below. Thanks for reading and have a great day!