# kubernetes
Kubernetes is an open-source platform that automates the deployment, scaling, and management of containerized applications. It acts as an orchestrator, ensuring your containers run reliably across clusters of machines, handling networking, storage, and updates without downtime.
- [Namespaces](#namespaces)
  - [Create namespace](#create-namespace)
- [Pods](#pods)
  - [Create an pod](#create-an-pod)
  - [Get Pod](#get-pod)
  - [delete Pod](#delete-pod)
  - [OOMKilled](#oomkilled)
  - [Attach to an pod](#attach-to-an-pod)
  - [Run command on pod](#run-command-on-pod)
- [Persistent volumes](#persistent-volumes)
  - [find persistent volume used pvc](#find-persistent-volume-used-pvc)
  - [Patch pv - change to retain policy](#patch-pv---change-to-retain-policy)
  - [Patch pv - remove finalizers](#patch-pv---remove-finalizers)
- [kubectl](#kubectl)
  - [Helper pods](#helper-pods)
    - [network testing](#network-testing)
  - [Resources](#resources)
  - [Services Accounts](#services-accounts)
- [Secrets](#secrets)
  - [Manifest - Opaque / Base64](#manifest---opaque--base64)
  - [Manifest - StringData](#manifest---stringdata)
  - [Inline with heredoc and environment variables](#inline-with-heredoc-and-environment-variables)
  - [substr](#substr)
- [nodes](#nodes)
  - [taint nodes](#taint-nodes)
    - [control plane - NoSchedule](#control-plane---noschedule)
- [statefulset](#statefulset)
  - [statefulset - Set Replicas](#statefulset---set-replicas)
- [Deployment](#deployment)
  - [Deployment - Set Replicas](#deployment---set-replicas)
  - [Deployment - Restart](#deployment---restart)
- [certs](#certs)
  - [list all certs](#list-all-certs)
  - [get cert end date](#get-cert-end-date)
- [service accounts](#service-accounts)
- [core-dns](#core-dns)
  - [Services DNS Name](#services-dns-name)
- [Custom Resource Definitions](#custom-resource-definitions)
- [k3s](#k3s)
  - [Install / Setup](#install--setup)
  - [prune old images](#prune-old-images)
  - [check system logs](#check-system-logs)
  - [Workarounds \& Fixes](#workarounds--fixes)
    - [Failed unmounting var-lib-rancher.mount on reboot](#failed-unmounting-var-lib-ranchermount-on-reboot)
  - [klipper-lb](#klipper-lb)
    - [troubleshooting](#troubleshooting)
## Namespaces
### Create namespace
Using cli
``` bash
kubectl create namespace tests
```
Or using yaml
``` yaml
apiVersion: v1
kind: Namespace
metadata:
  name: namespace-name
  labels:
    name: namespace-name
```
## Pods
### Create an pod
**Example: create an Ubuntu pod for TTY access:**
``` yaml
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-test
  namespace: tests
spec:
  # deploy to a specific node
  nodeName: chimera-gluten
  containers:
    - name: ubuntu-test
      image: ubuntu
      # In Kubernetes, the pod stays alive as long as PID 1 is running.
      # With these options, the container:
      # - does not exit automatically
      # - waits for user input indefinitely
      # - behaves like an interactive shell session
      command: ["sh"] # PID 1 = interactive shell
      stdin: true # keep STDIN open
      tty: true # allocate a terminal
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc
```
**Create an Ubuntu pod that executes a single command:**
``` yaml
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-ls-test
  namespace: tests
spec:
  restartPolicy: Never # runs only once, no retry on error
  #
  # nodeName: "serverExample01" # restrict to a specific node
  #
  containers:
    - name: ubuntu-seaweedfs-test
      image: ubuntu
      command: ["bash", "-c"]
      args:
        - "ls -lah /data"
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc
```
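Because the second pod runs once and exits, its output is read from the pod logs rather than from an interactive session. The following is a minimal sketch, not a definitive workflow. It assumes the manifest was saved to a file (the path is supplied by the caller), that `run_once_pod` is a hypothetical helper name, and that kubectl is v1.23 or newer, which `kubectl wait --for=jsonpath` requires.

```bash
# Apply a run-once pod manifest, wait for it to finish, print its output, clean up.
# Arguments: manifest path, pod name, namespace (all supplied by the caller).
run_once_pod() {
  local manifest="$1" pod="$2" ns="$3" status=0
  kubectl apply -f "$manifest" || return 1
  # Blocks until the pod phase is Succeeded; fails on timeout (for example if the command failed).
  kubectl wait --for=jsonpath='{.status.phase}'=Succeeded "pod/$pod" -n "$ns" --timeout=120s || status=1
  # Print the output even when the wait failed, since the logs usually explain why.
  kubectl logs "$pod" -n "$ns"
  kubectl delete pod "$pod" -n "$ns"
  return "$status"
}
# Usage: run_once_pod ./ubuntu-ls-test.yaml ubuntu-ls-test tests
```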
### Get Pod
**Get pod name by label `app`:**
```bash
POD_NAME=$(kubectl get pod -l app=myAppName -n appNamespace -o jsonpath='{.items[0].metadata.name}')
echo $POD_NAME
```
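The `items[0]` lookup reports an error when no pod matches the selector, which is awkward in scripts. A hedged alternative is sketched below; `first_pod_by_label` is a hypothetical helper name, and it relies on `-o name` printing one `pod/<name>` line per match.

```bash
# Print the name of the first pod matching a label selector, or fail if none match.
first_pod_by_label() {
  local selector="$1" ns="$2" name
  # "-o name" prints "pod/<name>" lines on stdout and nothing when no pod matches.
  name=$(kubectl get pod -l "$selector" -n "$ns" -o name | head -n 1)
  if [ -z "$name" ]; then
    echo "no pod matches '$selector' in namespace '$ns'" >&2
    return 1
  fi
  # Strip the "pod/" prefix so the result can be passed to exec, attach or logs.
  echo "${name#pod/}"
}
# Usage: POD_NAME=$(first_pod_by_label app=myAppName appNamespace) || exit 1
```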
**Find a pod by text in the listing, for example by IP:**
``` bash
kubectl get pods -A -o wide | grep -F 10.0.3.224
```
### delete Pod
``` bash
kubectl delete pod -n appNamespace -l app=myAppName
```
### OOMKilled
**list all OOMKilled pods:**
``` bash
kubectl get events --all-namespaces | grep -i "OOMKilled"
```
``` bash
kubectl get pods --all-namespaces \
-o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{" "}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}' \
| grep OOMKilled
```
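The jsonpath command above prints one line per pod in the form `namespace pod reason...`, and a plain `grep` returns the whole line. As a small sketch, the filter below (a hypothetical helper named `filter_oomkilled`) checks only the reason fields and prints `namespace/pod`.

```bash
# Read "namespace pod reason [reason...]" lines on stdin and print "namespace/pod"
# for every pod where at least one container was last terminated with OOMKilled.
filter_oomkilled() {
  awk '{ for (i = 3; i <= NF; i++) if ($i == "OOMKilled") { print $1 "/" $2; break } }'
}
# Usage: pipe the jsonpath command shown above into the filter, e.g.
#   kubectl get pods --all-namespaces -o jsonpath='...' | filter_oomkilled
```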
### Attach to an pod
Attach connects your terminal to the main process (PID 1) of a container. By default it uses the pod's default container; use `-c` to select another container in the pod.
Use it when you want to:
- see the raw output of the main process
- send input directly to the main process
``` bash
kubectl attach -it myPodName -n appNamespace
```
``` bash
POD_NAME=$(kubectl get pod -l app=myAppName -n appNamespace -o jsonpath='{.items[0].metadata.name}')
kubectl attach -it ${POD_NAME} -n appNamespace
```
### Run command on pod
``` bash
# sh
POD_NAME=$(kubectl get pod -l app=myAppName -n appNamespace -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it ${POD_NAME} -n appNamespace -- sh
```
``` bash
# bash
POD_NAME=$(kubectl get pod -l app=myAppName -n appNamespace -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it ${POD_NAME} -n appNamespace -- bash
```
``` bash
# execute a command such as ls
POD_NAME=$(kubectl get pod -l app=myAppName -n appNamespace -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it ${POD_NAME} -n appNamespace -- ls /
```
## Persistent volumes
### find persistent volume used pvc
``` bash
NAMESPACE="???"
PVC_NAME="???"
PV_NAME=$(kubectl get pvc "$PVC_NAME" -n "$NAMESPACE" -o jsonpath='{.spec.volumeName}')
echo "${PV_NAME}"
```
### Patch pv - change to retain policy
``` bash
PV_NAME="???"
kubectl patch pv $PV_NAME \
-p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```
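The lookup and the patch can be combined and the result checked afterwards. The sketch below is illustrative only; `retain_pv_for_pvc` is a hypothetical helper name, and it uses the same `kubectl get` and `kubectl patch` calls shown above.

```bash
# Switch the PV bound to a PVC to the Retain policy and print the resulting policy.
retain_pv_for_pvc() {
  local ns="$1" pvc="$2" pv
  pv=$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}')
  if [ -z "$pv" ]; then
    echo "PVC '$pvc' in namespace '$ns' is not bound to a volume" >&2
    return 1
  fi
  kubectl patch pv "$pv" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' >/dev/null || return 1
  # Read the policy back so the caller can confirm the change was applied.
  kubectl get pv "$pv" -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
}
# Usage: retain_pv_for_pvc tests data-pvc
```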
### Patch pv - remove finalizers
``` bash
PV_NAME="???"
kubectl patch pv $PV_NAME \
-p '{"metadata":{"finalizers": null}}'
```
## kubectl
kubectl is the command-line tool used to interact with Kubernetes clusters. Think of it as the “remote control” for Kubernetes: it lets you deploy applications, inspect resources, and manage cluster operations directly from your terminal.
### Helper pods
#### network testing
``` bash
kubectl run -i --tty dns-test --namespace tests --image=busybox --restart=Never -- sh
kubectl delete pod dns-test --namespace tests || true
```
**Example using yaml and hostNetwork:**
- Create Pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-test
  namespace: tests
spec:
  hostNetwork: true
  containers:
    - name: dns-test
      image: busybox
      command: ["sh"]
      stdin: true
      tty: true
```
- Attach to Pod
```bash
kubectl attach -it dns-test -n tests
```
- Execute command inside pod.
``` bash
nslookup google.com
```
- Delete pod
```bash
kubectl delete pod dns-test --namespace tests
```
### Resources
**List all resources in a namespace and filter by name:**
```bash
kubectl get all -n kube-system | grep traefik
```
**List service accounts:**
```bash
kubectl get serviceAccount --all-namespaces
```
### Services Accounts
**List all:**
```bash
kubectl get serviceAccount --all-namespaces
```
**Get Service Account Token:**
```bash
kubectl get secret <secret_name> -o jsonpath='{.data.token}' | base64 -d
```
```bash
kubectl get secret <secret_name> -o jsonpath='{.data.token}' | base64 -d > ./service-account-secret-base64
```
**Get Cluster certificate Base64:**
```bash
kubectl config view --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}'
```
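On Kubernetes v1.24 and newer, token Secrets are no longer generated automatically for service accounts, so the Secret lookup above may find nothing. A hedged alternative is `kubectl create token`, which requests a short-lived token. In the sketch below, `sa_token` is a hypothetical helper name, and the `default` namespace and `1h` duration are assumptions to adjust.

```bash
# Request a short-lived token for a service account (requires kubectl v1.24+).
# Arguments: service account name, namespace (default: default), duration (default: 1h).
sa_token() {
  local sa="$1" ns="${2:-default}" duration="${3:-1h}"
  kubectl create token "$sa" -n "$ns" --duration="$duration"
}
# Usage: sa_token continuous-deploy default 1h
```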
## Secrets
### Manifest - Opaque / Base64
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: secret-name
  namespace: namespace-name
type: Opaque
data:
  SERVER_ADDRESS: MTI3LjAuMC4x # 127.0.0.1 BASE64
```
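Values under `data` must be Base64-encoded by hand. A small sketch of how the value above can be produced follows; `b64` is a hypothetical helper name. `printf '%s'` is used instead of `echo` so that no trailing newline ends up inside the encoded value.

```bash
# Base64-encode a value for the "data" field of a Secret.
b64() {
  # tr removes the line wraps that some base64 implementations add to long values.
  printf '%s' "$1" | base64 | tr -d '\n'
}
# Usage: b64 127.0.0.1   # prints MTI3LjAuMC4x
```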
### Manifest - StringData
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: secret-name
  namespace: namespace-name
stringData:
  SERVER_ADDRESS: 127.0.0.1
```
### Inline with heredoc and environment variables
``` bash
SERVER_ADDRESS=127.0.0.1
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: secret-name
  namespace: namespace-name
stringData:
  SERVER_ADDRESS: ${SERVER_ADDRESS}
EOF
```
### substr
**yaml secret template:**
``` yaml
# ./secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: secret-name
  namespace: namespace-name
stringData:
  SERVER_ADDRESS: ${SERVER_ADDRESS}
```
``` bash
export SERVER_ADDRESS="127.0.0.1"
envsubst < ./secret.yaml | kubectl apply -f -
```
**env file and envsubst:**
``` bash
#---
# ./.env
# content:
# SERVER_ADDRESS=127.0.0.1
#---
set -a
source ./.env
set +a
envsubst < ./secret.yaml | kubectl apply -f -
```
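Plain `envsubst` replaces every `$VAR` it finds and silently substitutes an empty string for unset variables. The sketch below, with a hypothetical helper named `render_secret`, guards against an empty `SERVER_ADDRESS` and limits substitution to that one variable. It assumes the variable has been exported, as in the examples above.

```bash
# Render a Secret template with envsubst, substituting only SERVER_ADDRESS.
render_secret() {
  local template="$1"
  if [ -z "${SERVER_ADDRESS:-}" ]; then
    echo "SERVER_ADDRESS is not set" >&2
    return 1
  fi
  # The quoted list limits substitution to the named variable;
  # any other "$..." text in the template is left untouched.
  envsubst '${SERVER_ADDRESS}' < "$template"
}
# Usage: render_secret ./secret.yaml | kubectl apply -f -
```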
## nodes
**Get nodes info:**
```bash
kubectl get nodes -o wide
```
**get node taints:**
``` bash
kubectl describe node <NODE_NAME> | grep -i taint
```
**remove annotation:**
``` bash
kubectl annotate node <NODE_NAME> <ANNOTATION_NAME>-
```
### taint nodes
#### control plane - NoSchedule
``` bash
MASTER_NODE_NAME="master-node-name"
kubectl taint nodes ${MASTER_NODE_NAME} node-role.kubernetes.io/control-plane=:NoSchedule
```
## statefulset
### statefulset - Set Replicas
```bash
kubectl patch statefulset <statefulset-name> \
-p '{"spec":{"replicas":0}}'
```
## Deployment
### Deployment - Set Replicas
``` bash
DEPLOYMENT_NAME="???"
# example with 0 to "disable" deployment
kubectl scale deployment ${DEPLOYMENT_NAME} --replicas=0
```
```bash
kubectl patch deployment <deployment-name> \
-p '{"spec":{"replicas":0}}'
```
### Deployment - Restart
**example restart coredns:**
``` bash
kubectl rollout restart deployment coredns -n kube-system
```
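`kubectl rollout restart` returns as soon as the restart is requested. In scripts it can help to block until the new pods are ready. The following is a minimal sketch; `restart_and_wait` is a hypothetical helper name and the `120s` default timeout is an assumption.

```bash
# Restart a deployment and block until the new pods are rolled out.
# Arguments: deployment name, namespace, timeout (default: 120s).
restart_and_wait() {
  local name="$1" ns="$2" timeout="${3:-120s}"
  kubectl rollout restart "deployment/$name" -n "$ns" || return 1
  # Returns non-zero if the rollout does not complete within the timeout.
  kubectl rollout status "deployment/$name" -n "$ns" --timeout="$timeout"
}
# Usage: restart_and_wait coredns kube-system
```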
## certs
### list all certs
```bash
kubectl get cert -n default
```
### get cert end date
``` bash
kubectl get secret certificate-name-tls -o "jsonpath={.data['tls\.crt']}" | base64 --decode | openssl x509 -enddate -noout
```
## service accounts
**Get service account token:**
```bash
kubectl get secret continuous-deploy -o jsonpath='{.data.token}' | base64 -d
```
## core-dns
Kubernetes automatically provides DNS names for Services and Pods, and CoreDNS serves these records. This allows workloads to communicate using stable, predictable names instead of changing IP addresses.
### Services DNS Name
```text
<service-name>.<namespace>.svc.<cluster-domain>
```
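The pattern above can be assembled in a script. The sketch below uses a hypothetical helper named `service_fqdn` and assumes the cluster domain is `cluster.local`, the Kubernetes default, unless another one is passed.

```bash
# Build the DNS name of a Service.
# Arguments: service name, namespace, cluster domain (default: cluster.local).
service_fqdn() {
  local service="$1" ns="$2" domain="${3:-cluster.local}"
  echo "${service}.${ns}.svc.${domain}"
}
# Usage: service_fqdn test-services services
#   prints test-services.services.svc.cluster.local
```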
To remove the following warnings from the CoreDNS logs:
```log
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.override
```
1. Apply this ConfigMap to the cluster:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system
data:
  log.override: |
    #
  stub.server: |
    #
```
## Custom Resource Definitions
- **Definition:** A Custom Resource Definition (CRD) is an extension of the Kubernetes API.
- **Purpose:** CRDs allow you to define new resource kinds (e.g., Database, Backup, FooBar) that behave like native Kubernetes objects.
- **Analogy:** By default, Kubernetes understands objects like Pods and Services. With CRDs, you can add your own object types and manage them with kubectl just like built-in resources.
**List traefik CRDs:**
```bash
kubectl get crds | grep traefik
```
## k3s
K3s is a lightweight, certified Kubernetes distribution designed to run in resource-constrained environments such as edge devices, IoT appliances, and small servers. It simplifies installation and operation by packaging Kubernetes into a single small binary, while still being fully compliant with the Kubernetes API.
🌐 What K3s Is
- Definition: K3s is a simplified Kubernetes distribution created by Rancher Labs (now part of SUSE) and maintained under the CNCF.
- Purpose: It's built for environments where full Kubernetes (K8s) is too heavy, such as Raspberry Pis, edge servers, or CI pipelines.
- Size: The entire distribution is packaged into a single binary of under roughly 70 MB.
### Install / Setup
**Default master installation:**
``` bash
curl -sfL https://get.k3s.io | sh -
```
Install specific version and disable:
- flannel (alternative example calico)
- servicelb (alternative example metallb)
- traefik (then install using helm chart or custom manifests for better control)
```bash
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.33.3+k3s1 INSTALL_K3S_EXEC="--flannel-backend=none \
--disable-network-policy \
--cluster-cidr=10.42.0.0/16 \
--disable=servicelb \
--disable=traefik" \
sh -
```
### prune old images
Prune unused images. Run this on the Kubernetes host node:
```bash
crictl rmi --prune
```
### check system logs
```bash
sudo journalctl -u k3s-agent --since "1h ago" --reverse --no-pager | more
sudo journalctl -u k3s-agent --since "1 hour ago" --reverse | grep -i "Starting k3s-agent.service"
sudo journalctl -u k3s --reverse | grep -i "Starting k3s.service"
```
*Example Service DNS name: `test-services.services.svc.cluster.local`.*
### Workarounds & Fixes
#### Failed unmounting var-lib-rancher.mount on reboot
When K3s runs with /var/lib/rancher on a separate disk, K3s and containerd often leave behind mount namespaces and overlay layers that block clean unmounting during shutdown.
This causes slow reboots and errors like:
``` bash
Failed unmounting var-lib-rancher.mount
```
1. Create the cleanup service
``` bash
nano /etc/systemd/system/rancher-cleanup.service
```
Paste:
``` bash
[Unit]
DefaultDependencies=no
Before=shutdown.target
[Service]
Type=oneshot
ExecStart=/bin/sh -c '/bin/umount -l /var/lib/rancher || true'
[Install]
WantedBy=shutdown.target
```
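Why this works:
- `DefaultDependencies=no` removes the implicit dependencies systemd adds to every service, including the conflict with `shutdown.target`, so the unit can be started during shutdown.
- `Before=shutdown.target` orders the unit ahead of `shutdown.target`, so it completes before systemd reaches the final shutdown stage.
- `umount -l` detaches the filesystem immediately, even if containerd still holds namespaces.
- `|| true` prevents harmless “not mounted” errors from blocking shutdown.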
1. Reload systemd
``` bash
systemctl daemon-reload
```
1. Enable the cleanup service
```bash
systemctl enable rancher-cleanup.service
```
1. Reboot to test:
``` bash
reboot
```
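After the reboot, it is worth confirming that the unit is enabled and that it actually ran. The following is a sketch only; `check_rancher_cleanup` is a hypothetical helper name. It assumes the journal is persistent, since `journalctl -b -1` needs logs from the previous boot.

```bash
# Check that the cleanup unit is enabled and show what it logged during the previous boot.
check_rancher_cleanup() {
  local unit="rancher-cleanup.service"
  if ! systemctl is-enabled --quiet "$unit"; then
    echo "$unit is not enabled" >&2
    return 1
  fi
  # "-b -1" selects the previous boot, which includes its shutdown phase.
  journalctl -b -1 -u "$unit" --no-pager
}
# Usage (as root): check_rancher_cleanup
```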
### klipper-lb
KlipperLB is the tiny, built-in load balancer that k3s uses to give each agent a local, stable endpoint for talking to the Kubernetes API server. Instead of exposing a full external load balancer, k3s runs this lightweight component on 127.0.0.1:6444, and it simply forwards traffic from the agent to the control-plane node (or rotates between multiple servers in an HA setup). It exists to make k3s simpler to deploy: no extra software and no external LB. The warning below can appear at agent startup even though the cluster continues working normally.
#### troubleshooting
**log: warning - Error starting load balancer: listen tcp 127.0.0.1:6444: bind: address already in use.**
``` bash
rm -rf /var/lib/rancher/k3s/agent/etc/klipper-lb
systemctl restart k3s-agent
```
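Before removing the directory, it can be useful to see which process is already listening on the port. As a hedged sketch, the filter below (a hypothetical helper named `listeners_on_port`) reads `ss -ltnp` output and keeps the lines whose local address ends in the given port.

```bash
# Read "ss -ltnp" output on stdin and print the lines whose local address ends in the given port.
listeners_on_port() {
  # Field 4 of "ss -ltnp" is the local address in "address:port" form.
  awk -v port="$1" '$4 ~ (":" port "$")'
}
# Usage (run as root to see process names): ss -ltnp | listeners_on_port 6444
```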