Learning k8s, KCNA preparation (Part 4)
- tags
- #KCNA #K8s #Certification #Architecture #Containers #Kubernetes
- categories
- K8s Cloud-Native
- published
- reading time
- 15 minutes
In this installment of this KCNA series, we’ll dive into the fundamentals of Kubernetes and explore how it helps manage containers at scale. Running a few containers can be straightforward, but when dealing with hundreds or thousands of containers, efficient management requires more than basic tools—it needs a sophisticated orchestration platform. This is where Kubernetes excels.
Understanding the Role of Container Orchestration
```mermaid
flowchart TD
    subgraph ContainerOrchestration["Container Orchestration"]
        direction TB
        A[Provisioning]
        B[Deployment]
        C[Scaling]
        D[Standards & Frameworks]
        E[Core Component Integration]
    end
    subgraph Kubernetes["Kubernetes Capabilities"]
        direction TB
        F[Availability & Self-healing]
        G[Resource Management]
        H[Exposing Services]
        I[Security & Authorization]
        J[Autoscaling]
        K["Custom Resource Definitions (CRDs)"]
    end
    A --> G
    B --> F
    C --> J
    D --> K
    E --> H
    E --> I
```
Container orchestration handles critical aspects of deploying and managing containers across multiple environments. Here are some ways container orchestration simplifies and enhances containerized workloads:
- Provisioning: Allocating resources and ensuring services have the infrastructure they need.
- Deployment: Streamlining how containers are created and run.
- Scaling: Managing scaling dynamically to meet demand.
- Standards and Frameworks: Ensuring that best practices are consistently applied.
- Core Component Integration: Seamlessly integrating with storage, network, and authentication services.
Kubernetes stands out in container orchestration due to its focus on reliability, efficiency, and scalability:
- Availability and Self-healing: Automatically restarts failed containers.
- Resource Management: Efficiently schedules resources across nodes.
- Exposing Services: Manages routing and load-balancing for containerized apps.
- Security and Authorization: Integrates security measures for accessing and managing workloads.
- Autoscaling: Scales workloads up or down as needed.
- Custom Resource Definitions (CRDs): Extends Kubernetes functionality to manage custom resources.
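To illustrate that last point, here is a minimal sketch of a CustomResourceDefinition; the `crontabs.example.com` group, the `CronTab` kind, and the schema fields are hypothetical placeholders, not part of any real API:

```yaml
# Minimal CRD sketch; group, kind, and schema fields are hypothetical.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.example.com   # must be <plural>.<group>
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              cronSpec:
                type: string
              replicas:
                type: integer
```

Once a CRD like this is applied, the cluster accepts objects of the new kind just like built-in resources.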
Architecture Overview
At its core, Kubernetes follows a client-server architecture, dividing responsibilities between the Control Plane and Nodes:
- Control Plane: Manages the overall state of the cluster.
- Nodes: Execute the containers based on instructions from the Control Plane.
In a basic setup, there is one Control Plane and one Node, but Kubernetes supports high availability through multiple Control Planes and Nodes. In highly available clusters, `etcd`, the distributed key-value store, uses the Raft consensus algorithm to synchronize state across nodes.
```mermaid
flowchart TD
    subgraph ControlPlane["Control Plane"]
        direction TB
        APIServer["KubeAPI Server"]
        Scheduler["Kube-Scheduler"]
        Controller["Controller-Manager"]
        CloudController["Cloud-Controller-Manager (optional)"]
        etcd["etcd"]
    end
    subgraph Nodes["Nodes"]
        direction TB
        Kubelet["Kubelet"]
        KubeProxy["Kube-Proxy"]
        CoreDNS["CoreDNS"]
        Runtime["Container Runtime"]
    end
    %% Control Plane relationships
    APIServer --> etcd
    APIServer --> Scheduler
    APIServer --> Controller
    APIServer --> CloudController
    %% Node relationships
    Kubelet -->|Uses PodSpecs| Runtime
    Kubelet -->|Monitors| APIServer
    CoreDNS --> APIServer
    KubeProxy -->|Handles Networking| CoreDNS
    %% Node and Control Plane connections
    ControlPlane --> Nodes
    APIServer -->|Coordinates With| Kubelet
    APIServer -->|Exposes Cluster API| Users
    %% Optional Control Plane component
    CloudController -.->|Cloud-specific Operations| Controller
```
Container Runtimes
Low-level runtimes handle the core functionality of running containers. They interface with operating system components like Linux namespaces and cgroups to create isolated environments. `runc` is the main example here:
- `runc`: An Open Container Initiative (OCI) runtime, initially created by Docker, which now serves as the reference implementation.
- Other examples include crun, kata-runtime, and gVisor.
A high-level runtime, such as `containerd`, manages the entire lifecycle of containers and communicates with the low-level runtime:
- `containerd`: Originally developed by Docker and now part of the CNCF, `containerd` handles pulling images, storing them, and managing containers through the low-level runtime `runc`.
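To see the split between the two layers in practice, you can talk to containerd directly with its bundled `ctr` client; a quick sketch, assuming you are on a node where containerd is running:

```sh
# Pull an image into containerd's content store.
sudo ctr images pull docker.io/library/nginx:latest
# Run it in the foreground (Ctrl+C to stop); containerd delegates the
# actual container creation to runc via its shim.
sudo ctr run --rm docker.io/library/nginx:latest nginx-test
```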
Core Components of the Control Plane
Kubelet
The kubelet runs on both the Control Plane and Nodes, ensuring that the containers specified in Pod definitions are running on the system:
- It uses Pod Specifications (PodSpecs) in YAML or JSON to manage and monitor pod status.
- PodSpecs can be provided via API requests or a designated directory, often `/etc/kubernetes/manifests`.
- The kubelet coordinates with `containerd` and `runc` to create static pods essential to the Control Plane.
- It also monitors node health, e.g., checking CPU, memory, and storage metrics.
Static Pods and Key Control Plane Components
Static pods differ from dynamic pods in that they are defined by manifest files placed directly on a node and are managed by the kubelet rather than the API server. This makes them central to bootstrapping essential Control Plane services.
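On a kubeadm-based Control Plane node, you can see these static pod manifests for yourself; a sketch (the exact path can differ between distributions):

```sh
ls /etc/kubernetes/manifests
# etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml
```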
etcd:
- A distributed, strongly consistent key-value store.
- Provides the “source of truth” for cluster state.
- Supports leader elections and manages data consistency across nodes in highly available clusters.
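If you have shell access to a Control Plane node, you can peek at the keys the cluster stores in etcd; a sketch assuming `etcdctl` is installed and kubeadm's default certificate paths:

```sh
# List the keys Kubernetes keeps under /registry (paths assume kubeadm defaults).
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry/ --prefix --keys-only | head
```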
KubeAPI Server:
- Acts as the primary interface to the Control Plane.
- Provides a RESTful API and manages the data stored in `etcd`.
- Communicates with the kubelet and other components to orchestrate pods.
Kube-Scheduler:
- Assigns pods to nodes based on resources and constraints.
- Ensures efficient use of resources, dynamically adjusting to accommodate workloads.
Kube-Proxy:
- Runs as a DaemonSet, handling network routing and load-balancing for services.
- Configures forwarding for TCP, UDP, and SCTP connections across nodes.
- Uses iptables and IPVS as underlying tools for networking.
CoreDNS:
- A Deployment that handles DNS within the cluster.
- Ensures that pods and services can resolve each other by name.
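A quick way to verify this name resolution is a throwaway pod running `nslookup`; a short sketch (recent busybox images resolve cluster names fine):

```sh
# Resolve the API server's Service name through CoreDNS.
kubectl run -it --rm dnstest --image=busybox --restart=Never -- nslookup kubernetes.default
```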
Controller-Manager:
- Runs various controllers, such as Replication Controller and Node Controller, to maintain desired cluster state.
- Monitors and enforces resource counts and configurations specified in the cluster.
- Examples of controllers run by the Controller-Manager include the ReplicaSet, Endpoints, and ServiceAccount controllers.
Cloud-Controller-Manager (optional):
- Interfaces with cloud providers to support features like load balancers and persistent storage.
- Enables seamless integration of Kubernetes with cloud-specific services, typically only found in managed Kubernetes offerings.
- Cloud-Controller-Manager components like the Route Controller and Node Controller handle node lifecycle management in cloud environments, distinguishing this component from on-premises Kubernetes setups.
Managing Resources
Services and Networking
Kubernetes Services expose applications running on Pods as network services. There are four main types of services, each serving a specific purpose:
- ClusterIP: Exposes the service on an internal IP within the cluster (the default type).
- NodePort: Exposes the service on each node’s IP at a static port.
- LoadBalancer: Allocates a load balancer IP to expose services externally.
- ExternalName: Maps the service to an external name.
Headless Services are also worth noting. They allow direct access to each pod rather than routing through a single IP, which can be helpful for stateful applications that need unique pod identities.
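As a sketch, the Service below selects pods labeled `run: nginx`; commenting in `clusterIP: None` would turn it into a headless Service, so DNS returns the individual pod IPs instead of one virtual IP:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    run: nginx           # route to pods carrying this label
  ports:
  - port: 80
    targetPort: 80
  # type: ClusterIP is the default; NodePort or LoadBalancer expose it externally
  # clusterIP: None      # uncomment for a headless Service
```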
Port Forwarding with kubectl
Assume we have a pod running an `nginx` server. We can use `kubectl`'s port-forwarding feature to expose this pod's service locally.
kubectl port-forward pod/nginx 8080:80
With this command, the `nginx` server running on port 80 in the pod becomes accessible at http://localhost:8080 on our machine. This feature is incredibly useful for accessing resources running in the cluster without exposing them externally.
Pod-to-Pod Communication
Let's confirm that pods in a Kubernetes cluster can communicate with each other seamlessly. Assuming we have an `nginx` pod running at http://10.42.2.2, we can start a `curl` pod to make requests to it:
kubectl run -it --rm curl --image=curlimages/curl --restart=Never -- http://10.42.2.2
This command deploys a temporary pod running `curl` to fetch content from the `nginx` pod. Kubernetes' internal networking simplifies communication between pods, even across multiple nodes, without complex networking configuration.
Managing Resources with YAML
While `kubectl` commands are great for quick interactions, Kubernetes relies heavily on YAML files for defining configurations in a declarative way.
Let's create a basic YAML file to define our `nginx` pod. We can generate a sample configuration using `kubectl`:
kubectl run nginx --image=nginx --dry-run=client -o yaml | tee nginx.yaml
`nginx.yaml` will look something like this:
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
  dnsPolicy: ClusterFirst
  restartPolicy: Always
```
To deploy this pod, we use the apply command, which is Kubernetes’ preferred way to manage resources.
kubectl apply -f nginx.yaml
Combined YAML Files
We can extend our pod management capabilities by combining multiple YAML files into one and applying them at once. Suppose we have both an `nginx.yaml` and an `ubuntu.yaml` file; we can combine and apply them as follows:
```sh
{ cat nginx.yaml; echo "---"; cat ubuntu.yaml; } | tee combined.yaml
kubectl apply -f combined.yaml
```
Using this combined YAML structure, Kubernetes will handle multiple resources in a single operation.
The `---` separator signals the start of a new, separate resource definition.
Creating a Pod with Multiple Containers
Let's create a pod that includes two containers: an `nginx` web server and a "sidecar" container running a simple loop. This setup can be useful for scenarios like logging or monitoring, where a second container in the same pod supports the main container.
Here’s a sample YAML configuration for a two-container pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: webserver
    image: nginx
  - name: sidecar
    image: ubuntu
    args:
    - /bin/sh
    - -c
    - while true; do echo "$(date +'%T') - Hello from the sidecar"; sleep 5; if [ -f /tmp/crash ]; then exit 1; fi; done
  restartPolicy: Always
```
Apply this file:
kubectl apply -f mypod.yaml
With both containers running, we can view pod details with `kubectl get pods` and inspect the logs of each container separately:
kubectl logs mypod -c sidecar
Pod Lifecycle Management
Troubleshooting and Testing with kubectl exec
We can use `kubectl exec` to run commands inside containers. Suppose we want to simulate a crash in our sidecar container by creating a specific file:
kubectl exec -it mypod -c sidecar -- touch /tmp/crash
This command triggers an error in the sidecar container (the loop exits with status 1 as soon as `/tmp/crash` exists), causing Kubernetes to restart it, as specified by the `Always` restart policy.
To view the logs of a previous instance (after a restart), use the `-p` flag:
kubectl logs mypod -c sidecar -p
Init Containers
Init Containers run setup scripts or initialization tasks before the main container in a pod starts. They ensure that essential operations, like configuration checks or dependency preparations, are complete before launching the main app.
To see Init Containers in action, let's create a `countdown-pod`:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: countdown-pod
spec:
  initContainers:
  - name: init-countdown
    image: busybox
    command:
    - sh
    - -c
    - 'for i in $(seq 120 -1 0); do echo init-countdown: $i; sleep 1; done'
  containers:
  - name: main-container
    image: busybox
    command:
    - sh
    - -c
    - 'while true; do count=$((count + 1)); echo main-container: sleeping for 30 seconds - iteration $count; sleep 30; done'
```
To deploy the pod:
kubectl apply -f countdown-pod.yaml
This configuration sets up an Init Container that counts down from 120, pausing for a second between each count. After the Init Container completes, the main container begins looping every 30 seconds. Use the following commands to follow the `init-countdown` logs until the container completes, and then the `main-container` logs:
```sh
until kubectl logs pod/countdown-pod -c init-countdown --follow --pod-running-timeout=5m; do sleep 1; done
until kubectl logs pod/countdown-pod -c main-container --follow --pod-running-timeout=5m; do sleep 1; done
```
Namespaces
Namespaces help organize resources within a Kubernetes cluster, providing logical separation, resource management, and access control. Here’s a quick example of creating and using namespaces.
Create a Namespace:
kubectl create namespace mynamespace
Deploy an Nginx Pod in a Namespace:
kubectl -n mynamespace run nginx --image=nginx
View Pods in a Namespace:
kubectl -n mynamespace get pods
Set Namespace as Default: You can set your default namespace to avoid specifying `-n <namespace>` each time:
kubectl config set-context --current --namespace=mynamespace
Now, commands like `kubectl get pods` will operate within `mynamespace`.
Deployments, DaemonSets, and Jobs
Kubernetes offers various ways to manage application workloads, providing the flexibility to handle continuous deployments, node-specific processes, and scheduled tasks.
Continuous and Scheduled Workloads
Deployments and ReplicaSets
Deployments are essential for managing application updates, ensuring availability, and rolling back if necessary. Deployments also handle creating and maintaining ReplicaSets behind the scenes, which manage the actual pod replicas, simplifying scaling and reliability.
Create a Deployment: This command generates a deployment YAML for an nginx pod, then immediately applies it.
kubectl create deployment nginx --image=nginx --dry-run=client -o yaml | tee nginx-deployment.yaml | kubectl apply -f -
Check Deployment:
kubectl get deployments
Benefits of Deployments:
- Replication: Ensures desired pod count is maintained.
- Updates: Phased rollouts prevent downtime.
- Rollbacks: Easily revert if issues arise during updates.
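These benefits map to everyday `kubectl` commands; a few examples against the deployment created above (the `nginx:1.27` tag is just an illustration):

```sh
kubectl scale deployment nginx --replicas=3          # replication: set the desired pod count
kubectl set image deployment/nginx nginx=nginx:1.27  # updates: trigger a phased rollout
kubectl rollout status deployment/nginx              # watch the rollout progress
kubectl rollout undo deployment/nginx                # rollbacks: revert to the previous revision
```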
Jobs
Jobs are Kubernetes objects designed to run a task to completion, whether for batch processing, data crunching, or other one-off tasks. Jobs allow control over task completion and parallel execution.
Create a Job to Calculate PI: This job uses a Perl container to calculate PI.
kubectl create job calculatepi --image=perl:5.34.0 -- "perl" "-Mbignum=bpi" "-wle" "print bpi(2000)"
Watch the Job Progress:
watch kubectl get jobs
View Logs: Get the pod name and view logs to see the job result.
kubectl get pods -o wide
kubectl logs job/calculatepi
Create YAML Configuration for Parallel Job Execution: With Jobs, you can set `completions` and `parallelism` to control how many times a job runs and how many pods run simultaneously. Here's an example job configuration for calculating PI with multiple completions in parallel:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: calculatepi
spec:
  completions: 20
  parallelism: 5
  template:
    spec:
      containers:
      - command:
        - perl
        - -Mbignum=bpi
        - -wle
        - print bpi(2000)
        image: perl:5.34.0
        name: calculatepi
        resources: {}
      restartPolicy: Never
```
Run the Job: Apply the job and watch as multiple pods handle the calculations.
kubectl apply -f calculatepi.yaml && sleep 1 && watch kubectl get pods -o wide
Jobs in Kubernetes provide an effective way to run, complete, and repeat tasks as needed.
DaemonSets and CronJobs
DaemonSets
DaemonSets ensure a copy of a pod is running on every node in the cluster. They are ideal for tasks like logging and monitoring agents that need to run across all nodes.
Create a DaemonSet: Here’s an example of a DaemonSet using Fluentd for logging:
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd:latest
```
Deploy the DaemonSet:
kubectl apply -f daemonset.yaml
CronJobs
CronJobs allow you to schedule Jobs in Kubernetes, enabling you to run tasks at specific times, much like a traditional cron job on Unix systems. They’re ideal for tasks like:
- Generating regular reports
- Running maintenance routines
- Scheduling system or data updates
CronJobs offer time-based scheduling using the standard Unix cron syntax. They create Jobs according to a defined schedule, with options to set completion count, parallelism, and more. If you're new to cron syntax, crontab.guru is a handy tool for learning how to define schedules.
Example: Below is a CronJob configuration for a daily report task:
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-report
spec:
  schedule: "0 2 * * *"   # runs every day at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: report
            image: my-report-image:latest
            args: ["generate-report"]
          restartPolicy: OnFailure
```
In this example, Kubernetes runs the job daily at 2 AM, generating reports automatically.
Configuration Management
When managing configurations and sensitive information in Kubernetes, ConfigMaps and Secrets provide two distinct approaches, each with a specific purpose: ConfigMaps handle non-sensitive data, while Secrets securely manage sensitive information. Both are essential for separating configuration from application code, enhancing security and reusability.
ConfigMaps
ConfigMaps provide a way to store non-sensitive configuration data in key-value pairs. They make it easy to keep your configuration separate from application code, allowing for flexible and reusable setups. You can create ConfigMaps directly with literal values, from files, or from directories.
Creating a ConfigMap with Literal Values
kubectl create configmap color-configmap --from-literal=COLOR=red --from-literal=KEY=value
Creating a ConfigMap from a File
Let’s say we save configuration data to a file:
```sh
cat <<EOF > configmap-color.properties
COLOR=red
KEY=value
EOF
```
We can then create a ConfigMap from this file:
kubectl create configmap color-configmap --from-env-file=configmap-color.properties
Viewing a ConfigMap:
kubectl describe configmap color-configmap
ConfigMaps are incredibly flexible and can be used in various scenarios, from environment variables to runtime configurations.
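For example, a pod can import every key of `color-configmap` as environment variables with `envFrom`; a minimal sketch (the pod name is arbitrary):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: color-demo
spec:
  containers:
  - name: demo
    image: busybox
    command: ["sh", "-c", "echo COLOR is $COLOR; sleep 3600"]
    envFrom:
    - configMapRef:
        name: color-configmap   # injects COLOR and KEY as env vars
  restartPolicy: Never
```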
Secrets
While ConfigMaps store non-sensitive data, Secrets are intended for storing sensitive information like API keys, passwords, and other confidential data. Kubernetes Secrets are similar to ConfigMaps, but with extra restrictions for handling sensitive information.
Note: Secret values are only base64-encoded, not encrypted, so consider enabling encryption at rest for `etcd`, where Secrets are stored, or consider alternatives for handling highly confidential data.
Creating a Secret with Literal Values
Creating a Secret is similar to ConfigMaps, but with a different command:
kubectl create secret generic color-secret --from-literal=COLOR=red --from-literal=KEY=value
This command will create a Secret containing base64-encoded key-value pairs, allowing applications to access sensitive data without hardcoding it into the application code.
Viewing a Secret
To view the contents of a Secret:
kubectl describe secret color-secret
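`describe` deliberately hides the values; to read one back, extract it and base64-decode:

```sh
kubectl get secret color-secret -o jsonpath='{.data.COLOR}' | base64 -d
# red
```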
ConfigMaps and Secrets provide an efficient and secure way to manage both configuration data and sensitive information, keeping the application code clean and secure.
Labels
Labels in Kubernetes are crucial for organizing, identifying, and selecting resources. Labels, represented as key-value pairs, can be added to any resource, enabling logical grouping, filtering, and management of related resources. As your cluster environment grows, a consistent labeling strategy will simplify management and allow for efficient resource organization.
Adding a Label to a Pod
Here's a basic YAML configuration that adds a `run: nginx` label to a Pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    run: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
```
Apply this configuration with:
kubectl apply -f nginx-pod.yaml
To list all resources with a specific label, use the selector flag:
kubectl get all --selector run=nginx
A clear labeling strategy helps keep your Kubernetes environment organized and manageable, especially as your resources grow.
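Labels can also be added, changed, and queried on the fly; a few examples (the `env` key is arbitrary):

```sh
kubectl label pod nginx env=prod                   # add a label to a running pod
kubectl label pod nginx env=staging --overwrite    # change an existing label
kubectl get pods -l 'env in (prod,staging)'        # set-based label selector
kubectl label pod nginx env-                       # remove the label again
```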
Best Practices for Resource Management
When managing Kubernetes resources, consider these approaches:
- Declarative over Imperative: Use `apply` instead of `create`; `apply` is declarative and compares the current cluster state with the desired state.
- Separate Configurations into YAML Files: Modular YAML files make it easier to update, track, and manage resources individually or in groups.
- Utilize `kubectl` Commands: Leverage `kubectl explain` to explore the available configurations for different resource types and fields:
kubectl explain pod.spec.dnsPolicy
Conclusion
This turned into a long post and we're only scratching the surface of this behemoth, but these hands-on examples should help you feel comfortable deploying and managing basic resources. In the next post, we'll go deeper into exciting subjects like the Kubernetes API, RBAC, scheduling, Helm, and more.