Kubeadm Guide to Installing a Kubernetes Cluster (updated from older install notes)
System environment: Rocky Linux 9.4, x86_64.
Deployment topology: three master nodes and two worker nodes.
Time synchronization

Systems from CentOS 7 onward (including Rocky Linux) use chrony for time synchronization; the old ntp is no longer used. chrony ships with the system by default; if it is missing, install it directly with the command below.
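A minimal sketch; chrony and chronyd are the standard package and service names on Rocky Linux:

```
dnf install -y chrony
systemctl enable --now chronyd
```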
Edit the configuration file, /etc/chrony.conf, to designate one machine in the cluster as the time server:
```
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (https://www.pool.ntp.org/join.html).
pool 2.rocky.pool.ntp.org iburst    # default pool of servers to sync with

# Clients should add the server's address or domain manually; the server itself does not need this.
server time.neu.edu.cn iburst       # added manually; iburst speeds up the initial sync
server time.windows.com iburst      # added manually

# Use NTP servers from DHCP.
sourcedir /run/chrony-dhcp

# Record the rate at which the system clock gains/losses time.
driftfile /var/lib/chrony/drift

# Allow the system clock to be stepped in the first three updates
# if its offset is larger than 1 second.
makestep 1.0 3

# Enable kernel synchronization of the real-time clock (RTC).
rtcsync

# Enable hardware timestamping on all interfaces that support it.
#hwtimestamp *

# Increase the minimum number of selectable sources required to adjust
# the system clock.
#minsources 2

# Allow NTP client access from local network.
# Uncomment this line and set the allowed IP range (or simply write "allow all").
# Enabling it makes this machine act as the time server.
#allow 192.168.0.0/16

# Serve time even if not synchronized to a time source.
#local stratum 10

# Require authentication (nts or key option) for all NTP sources.
#authselectmode require

# Specify file containing keys for NTP authentication.
keyfile /etc/chrony.keys

# Save NTS keys and cookies.
ntsdumpdir /var/lib/chrony

# Insert/delete leap seconds by slewing instead of stepping.
#leapsecmode slew

# Get TAI-UTC offset and leap seconds from the system tz database.
leapsectz right/UTC

# Specify directory for log files.
logdir /var/log/chrony

# Select which information is logged.
#log measurements statistics tracking
```
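After editing, restarting the service and listing the sources confirms that synchronization works (standard chronyc subcommand):

```
systemctl restart chronyd
chronyc sources -v    # shows each configured source and its reachability/offset
```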
Initial configuration

Prerequisites for every node:

- The network is reachable between all nodes.
- Each node has a unique hostname and a unique MAC address.
- Each node has at least 2 GB of memory and at least 2 CPU cores.
- The firewall is disabled on every node; if it cannot be disabled, port 6443 must be opened. Check with: nc 127.0.0.1 6443 -v
- Swap is disabled on every node, to prevent unexpected behavior.
- SELinux is disabled.

Once all of the above is done, the nodes must be rebooted.
Disable swap: edit the /etc/fstab file and comment out the swap partition.
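A sketch of the usual two commands, assuming the /etc/fstab entry contains the word "swap": the first disables swap immediately, the second comments the entry out so it stays off after reboot.

```
swapoff -a
sed -i '/\sswap\s/s/^/#/' /etc/fstab
```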
Disable SELinux: edit the configuration file /etc/sysconfig/selinux and set SELINUX=disabled.
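A sketch that takes effect immediately and persists across reboots (setenforce alone only lasts until reboot; the sed assumes the file currently reads SELINUX=enforcing):

```
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/sysconfig/selinux
```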
Enable iptables filtering on the bridge

Edit the configuration file /etc/sysctl.conf and add:

```
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
```

Then load the bridge netfilter module:

```
modprobe br_netfilter
```
Run the following commands to apply the settings and make the module load persist across reboots:

```
sysctl -p
echo "modprobe br_netfilter" >> /etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local
```
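A quick verification that the module is loaded and the sysctl took effect:

```
lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables    # should print "= 1"
```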
Install the container runtime (Docker)

Switch the repository to a domestic (China) mirror:

```
yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.ustc.edu.cn/docker-ce/linux/centos/docker-ce.repo
sed -i -e 's/download.docker.com/mirrors.ustc.edu.cn\/docker-ce/g' /etc/yum.repos.d/docker-ce.repo
```
Install Docker:

```
sudo yum install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin -y
```
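Docker should also be enabled and started before setting up the CRI shim (standard systemd unit name):

```
systemctl enable --now docker
```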
Install the Docker CRI shim, cri-dockerd (since v1.24.0, Kubernetes no longer ships this)

To install the CRI runtime, download the binary from the project's GitHub releases, along with all the files under packaging/systemd/ in the repository. Place the systemd files and the binary together in one directory, cd into it, and run the commands below.
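For example (a sketch; the version number is an assumption, take the current one from https://github.com/Mirantis/cri-dockerd/releases):

```
wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.14/cri-dockerd-0.3.14.amd64.tgz
tar -xzf cri-dockerd-0.3.14.amd64.tgz                    # extracts the cri-dockerd binary
git clone https://github.com/Mirantis/cri-dockerd.git    # contains packaging/systemd/
```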
```
install -o root -g root -m 0755 cri-dockerd /usr/local/bin/cri-dockerd && \
install packaging/systemd/* /etc/systemd/system && \
sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service && \
systemctl daemon-reload && \
systemctl enable --now cri-docker.socket
```
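A quick sanity check that the socket is live and the binary runs:

```
systemctl is-active cri-docker.socket    # should print "active"
cri-dockerd --version
```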
Install kubeadm, kubelet, and kubectl

Add the package repository. The official documentation gives the following configuration:

```
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.30/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.30/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
```
For mainland-China environments, replace the repository with the USTC mirror:

```
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.ustc.edu.cn/kubernetes/core:/stable:/v1.30/rpm/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.ustc.edu.cn/kubernetes/core:/stable:/v1.30/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
```
Install on all nodes:

```
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
```

--disableexcludes=kubernetes temporarily disables the matching exclude rules so the installation goes through; if no exclude rules are configured on the machine, the flag can be omitted.
Start kubelet

The kubelet service must be running before kubeadm is invoked:

```
systemctl enable --now kubelet
```
Initialization (any master node)

With the etcd cluster and its certificates prepared, begin initialization. Obtain the default configuration file and modify it.
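The defaults can be dumped to a file with kubeadm itself and then edited (the file name here matches the one used below):

```
kubeadm config print init-defaults > init-defaults.yaml
```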
```
[root@k8s-master1 kubeadm]# cat init-defaults.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.0.210
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
  imagePullPolicy: IfNotPresent
  name: master1
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  external:
    endpoints:
    - https://192.168.0.210:2379
    - https://192.168.0.211:2379
    - https://192.168.0.212:2379
    - https://192.168.0.213:2379
    - https://192.168.0.214:2379
    caFile: /root/etcd/cert/ca.pem
    certFile: /root/etcd/cert/etcd.pem
    keyFile: /root/etcd/cert/etcd-key.pem
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.30.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
scheduler: {}
[root@k8s-master1 kubeadm]#
```
List the images that need to be downloaded:

```
kubeadm config images list --config init-defaults.yaml
```
Pull the images; the pause image is also retagged because cri-dockerd looks for it under registry.k8s.io:

```
kubeadm config images pull --config init-defaults.yaml

docker pull registry.aliyuncs.com/google_containers/pause:3.9 && \
docker tag registry.aliyuncs.com/google_containers/pause:3.9 registry.k8s.io/pause:3.9
```
Start the initialization:

```
kubeadm init --config init-defaults.yaml
```
If initialization fails, reset before retrying:

```
kubeadm reset --cri-socket=unix:///var/run/cri-dockerd.sock && \
iptables -F && \
iptables -X && \
ipvsadm -C && \
rm -rf /etc/cni/net.d && \
rm -rf $HOME/.kube/config

etcdctl del "" --prefix
```
On success, kubeadm prints:

```
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.0.210:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:80bd4285e542c4761a180c4f2886d073b20870f98feecaba0680afe35e574c22
[root@k8s-master1 kubeadm]#
```
Following the prompt, run the join command on the other nodes. Note that kubelet must already be running on those nodes.
Join the cluster (other nodes)

Run the join command printed by the master's successful initialization, appending the CRI socket to use:

```
kubeadm join 192.168.0.210:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:80bd4285e542c4761a180c4f2886d073b20870f98feecaba0680afe35e574c22 \
    --cri-socket=unix:///var/run/cri-dockerd.sock
```
On success, the node reports:

```
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

[root@k8s-master2 run]#
```
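If the original token has expired (the TTL in the config above is 24h), a fresh join command can be generated on the control plane:

```
kubeadm token create --print-join-command
```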
Verify cluster state

Run the suggested command to check node status. An initial status of NotReady is normal: the cluster's network plugin has not been configured yet, so inter-node communication does not work.

```
[root@k8s-master1 kubeadm]# kubectl get nodes
NAME          STATUS     ROLES           AGE     VERSION
k8s-master2   NotReady   <none>          53s     v1.30.1
k8s-master3   NotReady   <none>          28s     v1.30.1
k8s-node1     NotReady   <none>          19s     v1.30.1
k8s-node2     NotReady   <none>          9s      v1.30.1
master1       NotReady   control-plane   9m38s   v1.30.1
[root@k8s-master1 kubeadm]#
```
Install Helm

Per the official documentation, download the binary from GitHub and add it to the system PATH. Website: https://helm.sh/
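A minimal sketch of the binary install; the version number is an assumption, substitute the current release:

```
wget https://get.helm.sh/helm-v3.15.0-linux-amd64.tar.gz
tar -zxvf helm-v3.15.0-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin/helm
helm version
```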
Install Calico

Open the official repository: https://github.com/projectcalico/calico. Switch the tag to the matching version, locate manifests/calico.yaml in the repository, click "raw", and copy the link to download the file on the server:

```
https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml
```
After downloading, edit the manifest: find the CALICO_IPV4POOL_CIDR item, which must match the cluster's podSubnet setting; IP_AUTODETECTION_METHOD pins node IP detection to a specific interface:

```
- name: CALICO_IPV4POOL_CIDR
  value: "10.244.0.0/16"
- name: IP_AUTODETECTION_METHOD
  value: "interface=ens192"
```
podSubnet can be set in several ways (a check command follows this list):

- At cluster initialization, pass --pod-network-cidr=192.168.0.0/16 to kubeadm init.
- Edit networking.podSubnet in the initialization configuration exported by kubeadm config; if the field does not exist, add it manually.
- If initialization has already happened, edit the in-cluster kubeadm-config directly with: kubectl edit configmap kubeadm-config -n kube-system -o yaml. Find networking and add the podSubnet field, or modify it if it already exists.
- If you change the configuration via kubeadm-config, restart the cluster afterwards (reboot all machines).
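To confirm what the running cluster currently has, the same ConfigMap can be read directly (jsonpath extraction is one option):

```
kubectl get configmap kubeadm-config -n kube-system \
  -o jsonpath='{.data.ClusterConfiguration}' | grep podSubnet
```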
Find all images referenced in the manifest and pull them on every node (docker save/load can carry the images to nodes without direct registry access):

```
[root@k8s-master1 calico]# cat calico.yaml | grep image:
          image: docker.io/calico/cni:v3.28.0
          image: docker.io/calico/cni:v3.28.0
          image: docker.io/calico/node:v3.28.0
          image: docker.io/calico/node:v3.28.0
          image: docker.io/calico/kube-controllers:v3.28.0
[root@k8s-master1 calico]#

docker pull docker.io/calico/cni:v3.28.0 && \
docker pull docker.io/calico/node:v3.28.0 && \
docker pull docker.io/calico/kube-controllers:v3.28.0

docker save -o calico_kube-controllers_v3.28.0.tar calico/kube-controllers:v3.28.0
docker save -o calico_cni_v3.28.0.tar calico/cni:v3.28.0
docker save -o calico_node_v3.28.0.tar calico/node:v3.28.0

docker load -i calico_cni_v3.28.0.tar && \
docker load -i calico_kube-controllers_v3.28.0.tar && \
docker load -i calico_node_v3.28.0.tar
```
Create the resources:

```
kubectl create -f calico.yaml
```
Check cluster status

Get pod status:

```
[root@k8s-master1 calico]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS    RESTARTS      AGE
kube-system   calico-kube-controllers-564985c589-j78p6   1/1     Running   0             45s
kube-system   calico-node-5v65n                          1/1     Running   0             45s
kube-system   calico-node-b7jgm                          1/1     Running   0             45s
kube-system   calico-node-hkwrg                          1/1     Running   0             45s
kube-system   calico-node-hsq6j                          1/1     Running   0             45s
kube-system   calico-node-m5lhn                          1/1     Running   0             46s
kube-system   coredns-7b5944fdcf-7pb8q                   1/1     Running   0             69m
kube-system   coredns-7b5944fdcf-dgwpn                   1/1     Running   0             69m
kube-system   kube-apiserver-master1                     1/1     Running   7 (10m ago)   74m
kube-system   kube-controller-manager-master1            1/1     Running   5 (10m ago)   74m
kube-system   kube-proxy-57r2c                           1/1     Running   1 (10m ago)   65m
kube-system   kube-proxy-7vw55                           1/1     Running   1 (10m ago)   65m
kube-system   kube-proxy-gmr7z                           1/1     Running   1 (10m ago)   65m
kube-system   kube-proxy-rwqsn                           1/1     Running   1 (10m ago)   69m
kube-system   kube-proxy-xt7zv                           1/1     Running   1 (10m ago)   65m
kube-system   kube-scheduler-master1                     1/1     Running   6 (10m ago)   74m
```
Get node status:

```
[root@k8s-master1 calico]# kubectl get node
NAME          STATUS   ROLES           AGE   VERSION
k8s-master2   Ready    <none>          66m   v1.30.1
k8s-master3   Ready    <none>          65m   v1.30.1
k8s-node1     Ready    <none>          65m   v1.30.1
k8s-node2     Ready    <none>          65m   v1.30.1
master1       Ready    control-plane   74m   v1.30.1
[root@k8s-master1 calico]#
```
Set cluster roles

```
kubectl taint node master1 node-role.kubernetes.io/master=true:NoSchedule && \
kubectl taint node k8s-master2 node-role.kubernetes.io/master=true:PreferNoSchedule && \
kubectl taint node k8s-master3 node-role.kubernetes.io/master=true:PreferNoSchedule && \
kubectl label node k8s-node1 node-role.kubernetes.io/node= && \
kubectl label node k8s-node2 node-role.kubernetes.io/node=

# Remove a taint by appending "-" to it:
kubectl taint nodes k8s-node1 node-role.kubernetes.io/node=true:NoExecute-

# Inspect a node's taints:
kubectl describe node k8s-node1 | grep Taints
```
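As a final smoke test, a throwaway deployment confirms that pods schedule onto the worker nodes (nginx is an arbitrary image choice):

```
kubectl create deployment nginx-test --image=nginx --replicas=2
kubectl get pods -o wide      # pods should land on k8s-node1 / k8s-node2
kubectl delete deployment nginx-test
```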