Installing Kubernetes 1.30.0 with kubeadm

An update of the old installation notes: a guide to installing a Kubernetes cluster with kubeadm.

System environment: Rocky Linux 9.4, x86_64.

Deployment topology: three master nodes and two worker nodes.

Time Synchronization

CentOS 7 and later use chrony for time synchronization; the older ntp is no longer used.

chrony ships with the system by default; if it is missing, install it with:

yum install chrony -y

Edit the configuration to designate one machine in the cluster as the time server. Configuration file: /etc/chrony.conf

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (https://www.pool.ntp.org/join.html).
pool 2.rocky.pool.ntp.org iburst # default: the pool to synchronize against
# Clients should manually add the server's address or domain name; the server side does not need these
server time.neu.edu.cn iburst # added manually; iburst speeds up the initial sync
server time.windows.com iburst # added manually

# Use NTP servers from DHCP.
sourcedir /run/chrony-dhcp

# Record the rate at which the system clock gains/losses time.
driftfile /var/lib/chrony/drift

# Allow the system clock to be stepped in the first three updates
# if its offset is larger than 1 second.
makestep 1.0 3

# Enable kernel synchronization of the real-time clock (RTC).
rtcsync

# Enable hardware timestamping on all interfaces that support it.
#hwtimestamp *

# Increase the minimum number of selectable sources required to adjust
# the system clock.
#minsources 2

# Allow NTP client access from local network.
# Uncomment and set the client subnet allowed to connect, or simply write "allow all"
# Once enabled, this machine acts as the time server
#allow 192.168.0.0/16

# Serve time even if not synchronized to a time source.
#local stratum 10

# Require authentication (nts or key option) for all NTP sources.
#authselectmode require

# Specify file containing keys for NTP authentication.
keyfile /etc/chrony.keys

# Save NTS keys and cookies.
ntsdumpdir /var/lib/chrony

# Insert/delete leap seconds by slewing instead of stepping.
#leapsecmode slew

# Get TAI-UTC offset and leap seconds from the system tz database.
leapsectz right/UTC

# Specify directory for log files.
logdir /var/log/chrony

# Select which information is logged.
#log measurements statistics tracking

Initial Setup

  • Network connectivity between all nodes
  • Each node has a unique hostname and a unique MAC address
  • Each node has at least 2 GB of memory and at least 2 CPU cores
  • Firewall disabled on every node; if it cannot be disabled, open port 6443
    • check with: nc 127.0.0.1 6443 -v
  • Swap disabled on every node, to prevent surprises
  • SELinux disabled
  • Reboot every node after completing the steps above

Disable Swap

Edit /etc/fstab and comment out the swap entry.

Disable SELinux

Edit /etc/sysconfig/selinux and set SELINUX=disabled.
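The two edits above can be scripted (a minimal sketch assuming root; the FSTAB and SELINUX_CONF variables are introduced here only so the sed edits can be rehearsed on copies of the files first):

```shell
# Sketch: disable swap and SELinux; the live steps require root.
FSTAB="${FSTAB:-/etc/fstab}"
SELINUX_CONF="${SELINUX_CONF:-/etc/sysconfig/selinux}"

# Comment out every active swap entry in fstab
[ -w "$FSTAB" ] && sed -ri 's,^([^#].*[[:space:]]swap[[:space:]]),#\1,' "$FSTAB" || true

# Persist SELINUX=disabled across reboots
[ -w "$SELINUX_CONF" ] && sed -ri 's,^SELINUX=.*,SELINUX=disabled,' "$SELINUX_CONF" || true

# Apply immediately (harmless to re-run)
swapoff -a 2>/dev/null || true
setenforce 0 2>/dev/null || true
```

A reboot is still required afterwards, as noted in the checklist above.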

Enable iptables Filtering on Bridged Traffic

Edit the configuration file: vi /etc/sysctl.conf

# Filter bridged IPv6 packets through ip6tables
net.bridge.bridge-nf-call-ip6tables = 1
# Filter bridged IPv4 packets through iptables
net.bridge.bridge-nf-call-iptables = 1

Note that modprobe br_netfilter is a shell command, not a sysctl setting, so it does not belong in /etc/sysctl.conf. Load the br_netfilter module first (the net.bridge keys only exist once it is loaded), apply the settings, then make the module load persist across reboots:

modprobe br_netfilter
sysctl -p
echo "modprobe br_netfilter" >> /etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local
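As an alternative to rc.local, on a systemd-based distro such as Rocky 9 the module load can be persisted with a modules-load.d drop-in (a sketch; the k8s.conf filename is an arbitrary choice):

```shell
# Read by systemd-modules-load.service at every boot
mkdir -p /etc/modules-load.d
echo "br_netfilter" > /etc/modules-load.d/k8s.conf

# Verify the module is loaded and the bridge sysctls are visible
lsmod | grep br_netfilter || true
sysctl net.bridge.bridge-nf-call-iptables || true
```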

Install the Container Runtime

Switch the Docker repository to a domestic mirror (USTC):

yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.ustc.edu.cn/docker-ce/linux/centos/docker-ce.repo
sed -i -e 's/download.docker.com/mirrors.ustc.edu.cn\/docker-ce/g' /etc/yum.repos.d/docker-ce.repo

Install Docker, then enable and start the service (the cri-dockerd shim in the next step needs a running Docker daemon):

sudo yum install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin -y
sudo systemctl enable --now docker

Install the Docker CRI shim (since v1.24.0, Kubernetes no longer ships dockershim)

To install the cri-dockerd runtime, download the binary from the repository's releases together with the unit files under packaging/systemd/, keep the repository's relative layout (binary at the top level), cd into that directory, and run:

install -o root -g root -m 0755 cri-dockerd /usr/local/bin/cri-dockerd && \
install packaging/systemd/* /etc/systemd/system && \
sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service && \
systemctl daemon-reload && \
systemctl enable --now cri-docker.socket
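A quick sanity check that the shim is up (a sketch; the socket path must match the criSocket value used in the kubeadm configuration later):

```shell
# Both the systemd socket unit and the Unix socket itself should exist
systemctl is-active cri-docker.socket
ls -l /var/run/cri-dockerd.sock
```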

Install kubeadm, kubelet, and kubectl

Add the package repository

The official documentation gives the following configuration:
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.30/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.30/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF

Given the network environment in mainland China, replace the repository with the USTC mirror:
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.ustc.edu.cn/kubernetes/core:/stable:/v1.30/rpm/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.ustc.edu.cn/kubernetes/core:/stable:/v1.30/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF

Install

Install on all nodes:
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes

--disableexcludes=kubernetes temporarily lifts the exclude rule defined in the repo file so the install can proceed; if no exclude rule is configured on the machine, the flag can be omitted.

Start kubelet

The kubelet service must be started before running kubeadm:
systemctl enable --now kubelet

Initialization <any master node>

With the etcd cluster and its certificates prepared, start the initialization.

Generate the default configuration file (kubeadm config print init-defaults) and edit it:

[root@k8s-master1 kubeadm]# cat init-defaults.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # advertiseAddress: 1.2.3.4
  advertiseAddress: 192.168.0.210 # this master node's IP
  bindPort: 6443
nodeRegistration:
  # criSocket: unix:///var/run/containerd/containerd.sock
  criSocket: unix:///var/run/cri-dockerd.sock # switch to the Docker CRI shim
  imagePullPolicy: IfNotPresent
  # name: node
  name: master1 # this master's hostname; must resolve via the hosts file
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
# use an external etcd cluster (replaces the default local etcd below)
etcd:
  external:
    endpoints:
    - https://192.168.0.210:2379
    - https://192.168.0.211:2379
    - https://192.168.0.212:2379
    - https://192.168.0.213:2379
    - https://192.168.0.214:2379
    caFile: /root/etcd/cert/ca.pem
    certFile: /root/etcd/cert/etcd.pem
    keyFile: /root/etcd/cert/etcd-key.pem
# etcd:
#   local:
#     dataDir: /var/lib/etcd
# imageRepository: registry.k8s.io
imageRepository: registry.aliyuncs.com/google_containers # pull images from Alibaba's domestic mirror
kind: ClusterConfiguration
kubernetesVersion: 1.30.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16 # added field: the pod subnet; must match the Calico configuration later
scheduler: {}
[root@k8s-master1 kubeadm]#

List the images that will be downloaded

kubeadm config images list --config init-defaults.yaml

Pull the images

kubeadm config images pull --config init-defaults.yaml

# This setup additionally needs the pause image retagged
docker pull registry.aliyuncs.com/google_containers/pause:3.9 && docker tag registry.aliyuncs.com/google_containers/pause:3.9 registry.k8s.io/pause:3.9

Run the initialization

kubeadm init --config init-defaults.yaml

Resetting after a failed initialization

kubeadm reset --cri-socket=unix:///var/run/cri-dockerd.sock && \
iptables -F && \
iptables -X && \
ipvsadm -C && \
rm -rf /etc/cni/net.d && \
rm -rf $HOME/.kube/config

# The external etcd data must also be cleared
etcdctl del "" --prefix
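With a TLS-secured external etcd cluster, etcdctl needs the endpoints and certificates passed explicitly (a sketch reusing the certificate paths from the init configuration above; adjust endpoints and paths to your environment):

```shell
# Wipe all keys from the external etcd cluster (destructive!)
ETCDCTL_API=3 etcdctl \
  --endpoints=https://192.168.0.210:2379 \
  --cacert=/root/etcd/cert/ca.pem \
  --cert=/root/etcd/cert/etcd.pem \
  --key=/root/etcd/cert/etcd-key.pem \
  del "" --prefix
```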

Output on a successful initialization

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.0.210:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:80bd4285e542c4761a180c4f2886d073b20870f98feecaba0680afe35e574c22
[root@k8s-master1 kubeadm]#

Following the printed instructions, run the join command on the other nodes.

Note that kubelet must already be running on those nodes.
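The bootstrap token expires after the ttl set in the init configuration (24h here). If it has lapsed, a fresh join command can be printed on the control plane:

```shell
# Creates a new token and prints the full kubeadm join command
kubeadm token create --print-join-command
```

Remember to append --cri-socket=unix:///var/run/cri-dockerd.sock to the printed command, as shown in the join step.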

Joining the cluster <other nodes>

Run the join command printed by the successful init, appending the CRI socket in use:

kubeadm join 192.168.0.210:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:80bd4285e542c4761a180c4f2886d073b20870f98feecaba0680afe35e574c22 --cri-socket=unix:///var/run/cri-dockerd.sock

Output on a successful join

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

[root@k8s-master2 run]#

Verify the cluster state

Run the suggested command to list the nodes. At this stage every node shows NotReady, which is expected: the network plugin is not yet installed, so pod networking between nodes does not work yet.
[root@k8s-master1 kubeadm]# kubectl get nodes
NAME          STATUS     ROLES           AGE     VERSION
k8s-master2   NotReady   <none>          53s     v1.30.1
k8s-master3   NotReady   <none>          28s     v1.30.1
k8s-node1     NotReady   <none>          19s     v1.30.1
k8s-node2     NotReady   <none>          9s      v1.30.1
master1       NotReady   control-plane   9m38s   v1.30.1
[root@k8s-master1 kubeadm]#

Install Helm

Following the official documentation, download the binary release from GitHub and put it on the PATH.

Official site: https://helm.sh/

Install Calico

Open the official repository: https://github.com/projectcalico/calico

Switch the tag to the matching version, locate the manifests/calico.yaml file in the repository, click "raw", and copy the link to download it on the server:

https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml

After downloading, edit the manifest: find CALICO_IPV4POOL_CIDR and set it to the cluster's podSubnet. IP_AUTODETECTION_METHOD pins interface autodetection (ens192 here), useful when Calico would otherwise pick the wrong NIC.

- name: CALICO_IPV4POOL_CIDR
  value: "10.244.0.0/16"
- name: IP_AUTODETECTION_METHOD
  value: "interface=ens192"

The podSubnet can be set in several ways:

  • Pass --pod-network-cidr=192.168.0.0/16 to kubeadm init when initializing the cluster.
  • Set networking.podSubnet in the init configuration exported by kubeadm config; add the field if it does not exist.
  • If initialization is already done, edit the in-cluster kubeadm-config directly: kubectl edit configmap kubeadm-config -n kube-system -o yaml, then add (or modify) podSubnet under networking.

If the configuration is changed via kubeadm-config, the cluster must be restarted afterwards (reboot all machines).

Find all images referenced in the manifest, and pull them on every node in advance:
[root@k8s-master1 calico]# cat calico.yaml | grep image:
image: docker.io/calico/cni:v3.28.0
image: docker.io/calico/cni:v3.28.0
image: docker.io/calico/node:v3.28.0
image: docker.io/calico/node:v3.28.0
image: docker.io/calico/kube-controllers:v3.28.0
[root@k8s-master1 calico]#

# pull
docker pull docker.io/calico/cni:v3.28.0 && docker pull docker.io/calico/node:v3.28.0 && docker pull docker.io/calico/kube-controllers:v3.28.0

# save (export for offline transfer)
docker save -o calico_kube-controllers_v3.28.0.tar calico/kube-controllers:v3.28.0
docker save -o calico_cni_v3.28.0.tar calico/cni:v3.28.0
docker save -o calico_node_v3.28.0.tar calico/node:v3.28.0

# load (import on the other nodes)
docker load -i calico_cni_v3.28.0.tar && docker load -i calico_kube-controllers_v3.28.0.tar && docker load -i calico_node_v3.28.0.tar

Apply the manifest

kubectl create -f calico.yaml
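To wait for the rollout instead of polling by hand (a sketch; the daemonset and label names are the ones shipped in calico.yaml):

```shell
# Block until every node's calico-node pod is ready, then list them
kubectl -n kube-system rollout status daemonset/calico-node
kubectl -n kube-system get pods -l k8s-app=calico-node -o wide
```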

Check the cluster status

Get the pod status
[root@k8s-master1 calico]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS    RESTARTS      AGE
kube-system   calico-kube-controllers-564985c589-j78p6   1/1     Running   0             45s
kube-system   calico-node-5v65n                          1/1     Running   0             45s
kube-system   calico-node-b7jgm                          1/1     Running   0             45s
kube-system   calico-node-hkwrg                          1/1     Running   0             45s
kube-system   calico-node-hsq6j                          1/1     Running   0             45s
kube-system   calico-node-m5lhn                          1/1     Running   0             46s
kube-system   coredns-7b5944fdcf-7pb8q                   1/1     Running   0             69m
kube-system   coredns-7b5944fdcf-dgwpn                   1/1     Running   0             69m
kube-system   kube-apiserver-master1                     1/1     Running   7 (10m ago)   74m
kube-system   kube-controller-manager-master1            1/1     Running   5 (10m ago)   74m
kube-system   kube-proxy-57r2c                           1/1     Running   1 (10m ago)   65m
kube-system   kube-proxy-7vw55                           1/1     Running   1 (10m ago)   65m
kube-system   kube-proxy-gmr7z                           1/1     Running   1 (10m ago)   65m
kube-system   kube-proxy-rwqsn                           1/1     Running   1 (10m ago)   69m
kube-system   kube-proxy-xt7zv                           1/1     Running   1 (10m ago)   65m
kube-system   kube-scheduler-master1                     1/1     Running   6 (10m ago)   74m

Get the node status
[root@k8s-master1 calico]# kubectl get node
NAME          STATUS   ROLES           AGE   VERSION
k8s-master2   Ready    <none>          66m   v1.30.1
k8s-master3   Ready    <none>          65m   v1.30.1
k8s-node1     Ready    <none>          65m   v1.30.1
k8s-node2     Ready    <none>          65m   v1.30.1
master1       Ready    control-plane   74m   v1.30.1
[root@k8s-master1 calico]#

Assign node roles and taints

Taint effects:

  • NoSchedule: new pods will not be scheduled onto the node; existing pods are not evicted

  • PreferNoSchedule: the scheduler avoids the node when possible

  • NoExecute: new pods will not be scheduled, and existing pods on the node are evicted

kubectl taint node master1 node-role.kubernetes.io/master=true:NoSchedule && \
kubectl taint node k8s-master2 node-role.kubernetes.io/master=true:PreferNoSchedule && \
kubectl taint node k8s-master3 node-role.kubernetes.io/master=true:PreferNoSchedule && \
kubectl label node k8s-node1 node-role.kubernetes.io/node= && \
kubectl label node k8s-node2 node-role.kubernetes.io/node=

# Remove a taint (the trailing "-" deletes it)
kubectl taint nodes k8s-node1 node-role.kubernetes.io/node=true:NoExecute-
# Inspect a node's taints
kubectl describe node k8s-node1 | grep Taints