K8s Deployment Notes
Notes taken while following the official docs, so future deployments can be done quickly by referring back here.
kubeadm
The cluster is bootstrapped with kubeadm; the latest version at the time of writing is v1.34.
Three Debian 13 servers:
192.168.31.110 master
192.168.31.111 node1
192.168.31.112 node2
Each machine needs at least 2 GB of RAM and 2 CPU cores, the three servers must be able to reach each other, and their MAC addresses and product_uuid values must all be unique (directly cloned VMs can easily have duplicates).
On each server set the matching hostname with hostnamectl, e.g. on 192.168.31.110 run hostnamectl set-hostname master
Add the hosts entries on every server:
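A quick way to compare these across nodes (the same checks the official preflight docs suggest):
# list the MAC addresses of all interfaces
ip link | awk '/link\/ether/ {print $2}'
# print this machine's product_uuid
sudo cat /sys/class/dmi/id/product_uuid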
cat <<EOF | sudo tee -a /etc/hosts
192.168.31.110 master
192.168.31.111 node1
192.168.31.112 node2
EOF
Add the networking-related kernel parameters on every server:
sudo modprobe br_netfilter
echo 'br_netfilter' | sudo tee /etc/modules-load.d/br_netfilter.conf
cat <<EOF | sudo tee -a /etc/sysctl.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sudo sysctl -p
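Read the values back to confirm they took effect; all three should print 1:
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables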
Swap must be disabled on every server. If free -m shows non-zero swap, turn it off:
sudo swapoff -a
Also comment out the swap entry in /etc/fstab, otherwise swap comes back after a reboot; a sed one-liner for this is sketched below.
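A minimal sed sketch for that, assuming the usual whitespace-separated fstab layout (review the file afterwards):
# comment out any uncommented line that mounts a swap area
sudo sed -ri 's/^([^#].*[[:space:]]swap[[:space:]])/#\1/' /etc/fstab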
Preparation is done.
Next, more software needs to be installed: a Kubernetes cluster created by kubeadm depends on software that uses kernel features, including but not limited to a container runtime, the kubelet, and a Container Network Interface (CNI) plugin.
Container runtime
This needs to be installed on all three servers.
The official docs list these three runtimes:
Runtime | Unix domain socket |
---|---|
containerd | unix:///var/run/containerd/containerd.sock |
CRI-O | unix:///var/run/crio/crio.sock |
Docker Engine (with cri-dockerd) | unix:///var/run/cri-dockerd.sock |
Docker is what I use day to day, and some small services already run on Docker, so I'll keep using Docker for now.
Install Docker
Install using the Tsinghua (TUNA) mirror:
sudo apt install curl -y
export DOWNLOAD_URL="https://mirrors.tuna.tsinghua.edu.cn/docker-ce"
curl -fsSL https://raw.githubusercontent.com/docker/docker-install/master/install.sh | sh
Docker used to have the bigger image ecosystem, so Kubernetes shipped dockershim to adapt to it. As Kubernetes matured, dockershim was removed in v1.24 to standardize on the CRI interface, so cri-dockerd has to be installed to take over dockershim's role.
Install cri-dockerd:
wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.20/cri-dockerd-0.3.20.amd64.tgz
tar -zxf cri-dockerd-0.3.20.amd64.tgz
cd cri-dockerd/
sudo install -o root -g root -m 0755 cri-dockerd /usr/local/bin/cri-dockerd
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/refs/heads/master/packaging/systemd/cri-docker.socket
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/refs/heads/master/packaging/systemd/cri-docker.service
sudo install cri-docker.* /etc/systemd/system
sudo sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service
sudo systemctl daemon-reload
sudo systemctl enable --now cri-docker.socket
sudo systemctl enable --now cri-docker
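To verify the runtime came up, check the service state and that the socket kubeadm will talk to exists:
systemctl status cri-docker --no-pager
ls -l /var/run/cri-dockerd.sock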
Runtime installation is done.
Install kubelet, kubeadm and kubectl
These also go on all three servers.
Configure the TUNA mirror:
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.34/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
sudo tee /etc/apt/sources.list.d/kubernetes.list > /dev/null <<EOF
deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.tuna.tsinghua.edu.cn/kubernetes/core:/stable:/v1.34/deb/ /
EOF
Install:
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
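A quick sanity check that everything installed and is held at the expected version:
kubeadm version
kubectl version --client
kubelet --version
apt-mark showhold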
The network plugin has to wait until the cluster is initialized.
Create the cluster
Master node
Bootstrapping the cluster pulls the core component images; list the required ones with kubeadm config images list
[root 22:12:59 ~]$ kubeadm config images list
registry.k8s.io/kube-apiserver:v1.34.1
registry.k8s.io/kube-controller-manager:v1.34.1
registry.k8s.io/kube-scheduler:v1.34.1
registry.k8s.io/kube-proxy:v1.34.1
registry.k8s.io/coredns/coredns:v1.12.1
registry.k8s.io/pause:3.10.1
registry.k8s.io/etcd:3.6.4-0
Image problems
From mainland China these images generally can't be pulled directly, so when pulls fail some extra handling is needed.
Method 1: upload by hand
Pull the images on an overseas VPS, save them into a tarball, download it, then upload it to the nodes and load it.
Fetch the images:
# pull the images
docker pull registry.k8s.io/kube-apiserver:v1.34.1
docker pull registry.k8s.io/kube-controller-manager:v1.34.1
docker pull registry.k8s.io/kube-scheduler:v1.34.1
docker pull registry.k8s.io/kube-proxy:v1.34.1
docker pull registry.k8s.io/coredns/coredns:v1.12.1
docker pull registry.k8s.io/pause:3.10.1
docker pull registry.k8s.io/etcd:3.6.4-0
# save them into one tarball
docker save -o k8s.tar registry.k8s.io/kube-apiserver:v1.34.1 registry.k8s.io/kube-controller-manager:v1.34.1 registry.k8s.io/kube-scheduler:v1.34.1 registry.k8s.io/kube-proxy:v1.34.1 registry.k8s.io/coredns/coredns:v1.12.1 registry.k8s.io/pause:3.10.1 registry.k8s.io/etcd:3.6.4-0
apt install lrzsz -y
# download it
sz k8s.tar
The downloaded tarball is about 528.93 MB.
Upload and load the images:
sudo apt install lrzsz -y
rz
sudo docker load -i k8s.tar
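To confirm the load worked, the images should now be visible locally:
docker images | grep registry.k8s.io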
Method 2: run an image mirror
This is less convenient than uploading directly, but for an offline intranet deployment you can run a local registry, push the images into it, and pull from it after the changes below.
A mirror can be set up following https://evlan.cc/archives/use-docker-proxy-html.html
Beyond the steps in that article, just download https://raw.githubusercontent.com/dqzboy/Docker-Proxy/refs/heads/main/config/registry-k8s.yml and append the following to docker-compose.yaml:
  k8s:
    container_name: reg-k8s
    image: dqzboy/registry:latest
    restart: always
    environment:
      - OTEL_TRACES_EXPORTER=none
      #- http=http://host:port
      #- https=http://host:port
    volumes:
      - ./registry/data:/var/lib/registry
      - ./registry-k8s.yml:/etc/distribution/config.yml
      #- ./htpasswd:/auth/htpasswd
    ports:
      - 55000:5000
Run docker compose down && docker compose up -d
and k8s image acceleration is now available. A quick smoke test is sketched below.
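A quick smoke test, assuming <mirror-IP> is the host running the compose stack (the /v2/ endpoint is part of the standard registry API, and the linked setup maps registry.k8s.io at the registry root):
curl -i http://<mirror-IP>:55000/v2/
docker pull <mirror-IP>:55000/kube-apiserver:v1.34.1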
Using the mirror
Then, in the master's Docker config file /etc/docker/daemon.json, add an insecure-registries entry so the mirror is trusted:
{
  "insecure-registries": [
    "<mirror-IP>:55000"
  ]
}
Restart Docker to apply:
sudo systemctl daemon-reload
sudo systemctl restart docker
For a couple of reasons, pointing kubeadm at the mirror alone still doesn't let every image be pulled:
The official docs touch on this: for backward compatibility, the default registry.k8s.io and a custom registry behave differently when an image lives in a subpath. For example, registry.k8s.io/coredns/coredns:v1.11.1 becomes registry.lank8s.cn/coredns:v1.11.1 under a custom registry, so the coredns image breaks if only the global image repository is changed.
There is also a GitHub discussion: https://github.com/kubernetes/kubeadm/issues/2714
So besides setting the registry, coredns has to be specified separately, which requires the config-file approach.
Dump the default config: kubeadm config print init-defaults > kubeadm-init.yaml
Edit kubeadm-init.yaml and make the following changes (the resulting fragments are sketched after this list):
Change localAPIEndpoint.advertiseAddress to 192.168.31.110, i.e. this node's IP.
Change nodeRegistration.criSocket to unix:///var/run/cri-dockerd.sock
Change nodeRegistration.name to master, i.e. this node's hostname.
Change imageRepository to <mirror-IP>:55000
Under the dns key, add the child key imageRepository: <mirror-IP>:55000/coredns to pin coredns separately.
Under the networking key, add the child key podSubnet: 10.244.0.0/16, which is the flannel network plugin's default subnet.
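After those edits, the relevant fragments of kubeadm-init.yaml should look roughly like this (a sketch of the changed fields only, not the full file; <mirror-IP> is a placeholder):
localAPIEndpoint:
  advertiseAddress: 192.168.31.110
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
  name: master
---
imageRepository: <mirror-IP>:55000
dns:
  imageRepository: <mirror-IP>:55000/coredns
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12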
Pause image problem
To conform with the spec, the pause image is now configured entirely by the container runtime (CRI).
Check in cri-dockerd:
root@master:~/cri-dockerd# cri-dockerd --help| grep image
--image-pull-progress-deadline duration If no pulling progress is made before this deadline, the image pulling will be cancelled. (default 1m0s)
--pod-infra-container-image string The image whose network/ipc namespaces containers in each pod will use (default "registry.k8s.io/pause:3.10")
Its default image is registry.k8s.io/pause:3.10, while the one pulled through the mirror ends up as <mirror-IP>:55000/pause:3.10.1 and the manually uploaded one is registry.k8s.io/pause:3.10.1; neither matches the default registry.k8s.io/pause:3.10, so pin the image with a flag:
Edit the /etc/systemd/system/cri-docker.service file
and append to the start command, the line beginning with ExecStart:
for the mirror, append --pod-infra-container-image <mirror-IP>:55000/pause:3.10.1
for manually uploaded images, append --pod-infra-container-image registry.k8s.io/pause:3.10.1
The result looks roughly like:
ExecStart=/usr/local/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image <mirror-IP>:55000/pause:3.10.1
Apply:
sudo systemctl daemon-reload
sudo systemctl restart cri-docker.service
Initialize the cluster
If images can be pulled directly or were uploaded by hand, run sudo kubeadm init --cri-socket=unix:///var/run/cri-dockerd.sock --pod-network-cidr=10.244.0.0/16 --kubernetes-version=1.34.1
If using the mirror with the config file, run sudo kubeadm init --config=kubeadm-init.yaml
After a while it reports success:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.31.110:6443 --token v0nzpr.yallkskkym2ql2is \
--discovery-token-ca-cert-hash sha256:5cd2c14c5fb09418d2e38f04a5a7f1338e7fd54264636de1a294571430fe00a8
Follow the printed instructions.
If you are not root, run:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
If you are root, run export KUBECONFIG=/etc/kubernetes/admin.conf
Since this only sets an environment variable, add it to .bashrc, otherwise it has to be re-run on every root login:
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" | tee -a $HOME/.bashrc
This lets kubectl read the cluster config so the cluster can be managed, so it must be set.
If creation failed, or you want to recreate the cluster or remove it, run sudo kubeadm reset --cri-socket=unix:///var/run/cri-dockerd.sock to tear the cluster down.
At this point the cluster has only the master node, and it is still NotReady until a network plugin is installed:
root@master:~# kubectl get node
NAME STATUS ROLES AGE VERSION
master NotReady control-plane 7m32s v1.34.0
Network plugin
Download the manifest: wget https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
It also references images pulled from ghcr.io:
root@master:~# cat kube-flannel.yml | grep image
image: ghcr.io/flannel-io/flannel:v0.27.3
image: ghcr.io/flannel-io/flannel-cni-plugin:v1.7.1-flannel1
image: ghcr.io/flannel-io/flannel:v0.27.3
If those can't be downloaded either, handle them with the manual-upload steps above.
Then just run kubectl apply -f kube-flannel.yml
flannel's default subnet is 10.244.0.0/16, so applying it as-is works here; if your pod subnet differs, edit kube-flannel.yml first (the fragment to change is shown below) and then apply.
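The subnet lives in the net-conf.json of the kube-flannel-cfg ConfigMap inside kube-flannel.yml; roughly this fragment (backend type may vary with the release):
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }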
After a short wait, check the nodes again; the master has turned Ready:
root@master:~# kubectl get node
NAME STATUS ROLES AGE VERSION
master Ready control-plane 10m v1.34.0
At this point it still can't run workloads, because k8s has the concept of taints: a taint is a mark placed on a node, and a Pod that doesn't explicitly tolerate it will never be scheduled onto that node. In other words, workload pods have nowhere to go.
Check the taints:
root@master:~# kubectl describe node| grep Taints
Taints: node-role.kubernetes.io/control-plane:NoSchedule
Remove the taint by appending a minus sign:
kubectl taint nodes master node-role.kubernetes.io/control-plane:NoSchedule-
Now this is a usable single-node k8s and things can be deployed.
I had DeepSeek write a manifest that deploys 3 replicas of the nginx:alpine image and exposes them through NodePort 30080.
The default NodePort range is 30000-32767; a NodePort is the port reachable from outside the cluster directly via a node's IP.
Since Docker is the runtime, it pulls nginx:alpine by itself, so set up a Docker registry mirror or upload the image beforehand.
cat <<EOF > nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "64Mi"
            cpu: "50m"
          limits:
            memory: "128Mi"
            cpu: "100m"
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 2
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80          # Service port inside the cluster
    targetPort: 80    # container port on the Pod
    nodePort: 30080   # externally reachable NodePort
EOF
kubectl apply -f nginx-deployment.yaml
Check the result:
user@master:~$ kubectl get all
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-74b4757d66-6pzhf 1/1 Running 0 5m21s
pod/nginx-deployment-74b4757d66-7cpfc 1/1 Running 0 5m21s
pod/nginx-deployment-74b4757d66-rz4mm 1/1 Running 0 5m21s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 37m
service/nginx-service NodePort 10.97.252.151 <none> 80:30080/TCP 5m21s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 3/3 3 3 5m21s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-74b4757d66 3 3 3 5m21s
user@master:~$ curl 192.168.31.110:30080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
The single-node k8s is now usable.
The master normally doesn't run workloads; once worker nodes have joined, restore the taint with kubectl taint nodes master node-role.kubernetes.io/control-plane:NoSchedule
Worker nodes
Worker nodes need the same image handling as the master: either upload the images by hand, or run kubeadm config print join-defaults > kubeadm-join.yaml
to generate a config file and set the mirror in it. The cri-dockerd pause-image fix also has to be applied the same way.
First generate a token on the master with kubeadm token create --print-join-command --ttl 1h
(the default TTL is 24 hours; here it is set to 1 hour):
user@master:~/cri-dockerd$ kubeadm token create --print-join-command --ttl 1h
kubeadm join 192.168.31.110:6443 --token 10g25k.e22gzvd7s5czpfjs --discovery-token-ca-cert-hash sha256:c2f063cb518e8c70b3dd32205368d567325b576cbf8593e1cee90013c6890724
With manually uploaded images, no command-line registry override is involved; just run the join command directly:
sudo kubeadm join 192.168.31.110:6443 --token 10g25k.e22gzvd7s5czpfjs --discovery-token-ca-cert-hash sha256:c2f063cb518e8c70b3dd32205368d567325b576cbf8593e1cee90013c6890724 --cri-socket unix:///var/run/cri-dockerd.sock
I haven't tested the config-file approach; an untested sketch follows.
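For reference only, an untested sketch of what kubeadm-join.yaml might contain (field names per the kubeadm v1beta4 API; token and hash taken from the output above):
apiVersion: kubeadm.k8s.io/v1beta4
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: 192.168.31.110:6443
    token: 10g25k.e22gzvd7s5czpfjs
    caCertHashes:
    - sha256:c2f063cb518e8c70b3dd32205368d567325b576cbf8593e1cee90013c6890724
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
  name: node1
It would then be applied with sudo kubeadm join --config=kubeadm-join.yaml. After joining, the nodes show up: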
user@master:~/cri-dockerd$ kubectl get node
NAME STATUS ROLES AGE VERSION
master Ready control-plane 49m v1.34.0
node1 Ready <none> 44m v1.34.0
node2 Ready <none> 8m10s v1.34.0
Test again with the nginx-deployment.yaml from above:
user@master:~$ kubectl get node
NAME STATUS ROLES AGE VERSION
master Ready control-plane 56m v1.34.0
node1 Ready <none> 51m v1.34.0
node2 Ready <none> 15m v1.34.0
user@master:~$ kubectl get all
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-74b4757d66-qf4v9 1/1 Running 0 4m33s
pod/nginx-deployment-74b4757d66-rb4vj 1/1 Running 0 4m33s
pod/nginx-deployment-74b4757d66-s5jcn 1/1 Running 0 4m33s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 56m
service/nginx-service NodePort 10.108.55.72 <none> 80:30080/TCP 4m33s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 3/3 3 3 4m33s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-74b4757d66 3 3 3 4m33s
user@master:~$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-74b4757d66-qf4v9 1/1 Running 0 4m44s 10.244.2.2 node2 <none> <none>
nginx-deployment-74b4757d66-rb4vj 1/1 Running 0 4m44s 10.244.1.3 node1 <none> <none>
nginx-deployment-74b4757d66-s5jcn 1/1 Running 0 4m44s 10.244.1.2 node1 <none> <none>
Accessing IP:30080 on any of the three nodes returns the same page, and 10.108.55.72:80 is reachable from inside any node as well. You can also see the pods running on node1 and node2.