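Etcd Cluster Configuration

This article covers two ways to deploy an etcd cluster:

  • Plain (non-TLS) cluster deployment
  • TLS cluster deployment

etcd runs as a standalone binary, so download the program before you start.

Installation differs by operating system; follow the instructions on the Release page.

Download: see the etcd Release page.

The examples in this note use three machines: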

192.168.5.100 etcd0
192.168.5.101 etcd1
192.168.5.102 etcd2

Startup parameter reference

Because etcd is a binary program, it takes many arguments at startup. They are summarized here.

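| Parameter (with example value) | Description |
| --- | --- |
| --name etcd0 | Name of this member. |
| --initial-advertise-peer-urls http://node1_ip:2380 | Used by other members: they exchange information with this member through this address, so it must be reachable from every other member. With static configuration, this value must also appear in --initial-cluster. The member ID is derived from --initial-cluster-token and --initial-advertise-peer-urls. |
| --listen-peer-urls http://node1_ip:2380 | Used locally: the address on which this member listens for traffic from other members. An all-zero IP (0.0.0.0) listens on every local interface. |
| --listen-client-urls http://node1_ip:2379,http://127.0.0.1:2379 | Used locally: the addresses on which this member listens for traffic from etcd clients. An all-zero IP (0.0.0.0) listens on every local interface. |
| --advertise-client-urls http://node1_ip:2379 | Used by etcd clients: they exchange information with this member through this address, so it must be reachable from the client side. |
| --initial-cluster-token etcd-cluster-1 | Distinguishes one cluster from another. If several clusters run locally, give each a different token. |
| --initial-cluster etcd0=http://node1_ip:2380,etcd1=http://node2_ip:2380... | Used locally: describes every node in the cluster; this member uses it to contact the others. The member ID is derived from --initial-cluster-token and --initial-advertise-peer-urls. |
| --initial-cluster-state new | Indicates whether a new cluster is being created. Valid values are new and existing. With existing, the member tries to contact the other members at startup. Use new when the cluster is first built; in the author's tests, setting only the last node to existing also worked, but the other nodes must not use existing. When a member recovers from a failure in a running cluster, use existing; in the author's tests, new also worked. |
| --cert-file= | Path to client.pem. |
| --key-file= | Path to client-key.pem. |
| --peer-client-cert-auth --peer-trusted-ca-file= | Path to ca.pem. |
| --client-cert-auth --trusted-ca-file= | Path to ca.pem. |
| --peer-cert-file= | Path to peer.pem. |
| --peer-key-file= | Path to peer-key.pem. |
| --data-dir= | Path to the data directory. |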

Note: when creating a TLS cluster, the client and peer certificates can be the same certificate; see the examples later in this note.
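To illustrate how these parameters fit together, here is a minimal Python sketch that assembles the argument list for one member under static configuration. The function name `build_member_flags` and its signature are hypothetical, invented for this example; only the flags themselves come from the reference above. It enforces the rule that the member's advertised peer URL must also appear in --initial-cluster.

```python
from typing import List

INITIAL_CLUSTER = (
    "etcd0=http://192.168.5.100:2380,"
    "etcd1=http://192.168.5.101:2380,"
    "etcd2=http://192.168.5.102:2380"
)


def build_member_flags(name: str, ip: str, initial_cluster: str,
                       token: str = "etcd-cluster-1",
                       state: str = "new") -> List[str]:
    """Return the etcd argument list for one member (hypothetical helper)."""
    if state not in ("new", "existing"):
        raise ValueError("state must be 'new' or 'existing'")

    peer_url = f"http://{ip}:2380"
    # Static configuration: the advertised peer URL must be listed
    # in --initial-cluster under this member's name.
    if f"{name}={peer_url}" not in initial_cluster.split(","):
        raise ValueError(f"{name}={peer_url} is missing from --initial-cluster")

    return [
        "etcd",
        "--name", name,
        "--initial-advertise-peer-urls", peer_url,
        "--listen-peer-urls", peer_url,
        "--listen-client-urls", f"http://{ip}:2379,http://127.0.0.1:2379",
        "--advertise-client-urls", f"http://{ip}:2379",
        "--initial-cluster-token", token,
        "--initial-cluster", initial_cluster,
        "--initial-cluster-state", state,
    ]


print(" ".join(build_member_flags("etcd0", "192.168.5.100", INITIAL_CLUSTER)))
```

Raising an error early for a member that is missing from --initial-cluster is a design choice of this sketch, meant to catch the mismatch before etcd is started.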

Plain cluster

Creation

Every member of the same cluster must use the same token.

# Run on the etcd0 machine
etcd --name etcd0 --initial-advertise-peer-urls http://192.168.5.100:2380 \
--listen-peer-urls http://192.168.5.100:2380 \
--listen-client-urls http://192.168.5.100:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://192.168.5.100:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster etcd0=http://192.168.5.100:2380,etcd1=http://192.168.5.101:2380,etcd2=http://192.168.5.102:2380 \
--initial-cluster-state new
# Run on the etcd1 machine
etcd --name etcd1 --initial-advertise-peer-urls http://192.168.5.101:2380 \
--listen-peer-urls http://192.168.5.101:2380 \
--listen-client-urls http://192.168.5.101:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://192.168.5.101:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster etcd0=http://192.168.5.100:2380,etcd1=http://192.168.5.101:2380,etcd2=http://192.168.5.102:2380 \
--initial-cluster-state new

# Run on the etcd2 machine
etcd --name etcd2 --initial-advertise-peer-urls http://192.168.5.102:2380 \
--listen-peer-urls http://192.168.5.102:2380 \
--listen-client-urls http://192.168.5.102:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://192.168.5.102:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster etcd0=http://192.168.5.100:2380,etcd1=http://192.168.5.101:2380,etcd2=http://192.168.5.102:2380 \
--initial-cluster-state new
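Because the three commands above are edited by hand, it is easy for one member to end up with a different token or member list. The following Python sketch checks a set of such commands for consistency. The helpers `parse_flags` and `check_consistency` are hypothetical names, and the parser assumes every flag is followed by exactly one value, which is true for the plain-cluster commands shown here but not for every etcd flag.

```python
import shlex
from typing import Dict, List


def parse_flags(command: str) -> Dict[str, str]:
    """Parse `--flag value` pairs from an etcd command line.

    Assumes each flag takes exactly one value (true for the commands above).
    """
    # Join shell line continuations before tokenizing.
    tokens = shlex.split(command.replace("\\\n", " "))
    # tokens[0] is the program name; the rest alternate flag, value.
    return dict(zip(tokens[1::2], tokens[2::2]))


def check_consistency(commands: List[str]) -> List[str]:
    """Return a list of problems found across the members' commands."""
    parsed = [parse_flags(c) for c in commands]
    problems = []
    for flag in ("--initial-cluster-token", "--initial-cluster"):
        if len({p.get(flag) for p in parsed}) != 1:
            problems.append(f"{flag} differs between members")
    names = [p.get("--name") for p in parsed]
    if len(set(names)) != len(names):
        problems.append("--name is not unique")
    return problems


# Shortened version of the commands above, used only for the demonstration.
TEMPLATE = """etcd --name {name} --initial-advertise-peer-urls http://{ip}:2380 \\
--listen-peer-urls http://{ip}:2380 \\
--initial-cluster-token {token} \\
--initial-cluster etcd0=http://192.168.5.100:2380,etcd1=http://192.168.5.101:2380,etcd2=http://192.168.5.102:2380 \\
--initial-cluster-state new"""

commands = [
    TEMPLATE.format(name=f"etcd{i}", ip=f"192.168.5.10{i}", token="etcd-cluster-1")
    for i in range(3)
]
print(check_consistency(commands))
```

An empty list means the checked flags agree across all members.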

Verifying the status

alias etcdctl='etcdctl --endpoints=http://192.168.5.100:2379,http://192.168.5.101:2379,http://192.168.5.102:2379'
# Remember to add the etcd executables to PATH
export PATH=$PATH:/root/etcd

# Query node status with etcdctl endpoint status; append --write-out=table for table output
[root@etcd0 etcd]# etcdctl endpoint status --write-out=table
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|         ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://192.168.5.100:2379 | 53a850b9eb787bd3 |   3.5.0 |   20 kB |      true |      false |         2 |         10 |                 10 |        |
| http://192.168.5.101:2379 | 90ec48128cbac2c9 |   3.5.0 |   20 kB |     false |      false |         2 |         10 |                 10 |        |
| http://192.168.5.102:2379 | cee9cd3ff7bf0095 |   3.5.0 |   20 kB |     false |      false |         2 |         10 |                 10 |        |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
[root@etcd0 etcd]#

# Check cluster health
[root@etcd0 etcd]# etcdctl endpoint health --write-out=table
+---------------------------+--------+------------+-------+
|         ENDPOINT          | HEALTH |    TOOK    | ERROR |
+---------------------------+--------+------------+-------+
| http://192.168.5.100:2379 |   true | 3.773467ms |       |
| http://192.168.5.101:2379 |   true | 3.987185ms |       |
| http://192.168.5.102:2379 |   true | 4.104202ms |       |
+---------------------------+--------+------------+-------+
[root@etcd0 etcd]#
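When the status check is run from a script, the table output can be examined programmatically. The sketch below is one way to do that in Python: it parses the table printed by etcdctl endpoint status --write-out=table and reports which endpoint is the leader and which Raft terms were seen. The helper names `parse_table` and `summarize` are hypothetical, and the sample text is the output shown above.

```python
from typing import Dict, List, Set, Tuple

STATUS_TABLE = """
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://192.168.5.100:2379 | 53a850b9eb787bd3 | 3.5.0 | 20 kB | true | false | 2 | 10 | 10 | |
| http://192.168.5.101:2379 | 90ec48128cbac2c9 | 3.5.0 | 20 kB | false | false | 2 | 10 | 10 | |
| http://192.168.5.102:2379 | cee9cd3ff7bf0095 | 3.5.0 | 20 kB | false | false | 2 | 10 | 10 | |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
"""


def parse_table(text: str) -> List[Dict[str, str]]:
    """Turn an ASCII table into a list of {column name: cell} dicts."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")  # skip the +----+ border lines
    ]
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


def summarize(rows: List[Dict[str, str]]) -> Tuple[List[str], Set[str]]:
    """Return the leader endpoints and the set of Raft terms seen."""
    leaders = [r["ENDPOINT"] for r in rows if r["IS LEADER"] == "true"]
    terms = {r["RAFT TERM"] for r in rows}
    return leaders, terms


leaders, terms = summarize(parse_table(STATUS_TABLE))
print("leaders:", leaders)
print("raft terms:", terms)
```

In a settled cluster one would expect exactly one leader and a single Raft term across all members; anything else is worth a closer look.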

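TLS cluster

Preparing the certificates

Three files need to be generated with cfssl: ca.pem, etcd.pem, and etcd-key.pem.

Generate the CA files by following the configuration in the note "Creating certificates with cfssl". Then follow that note's section on generating a peer certificate, replacing the request information with the etcd-csr.json example below.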

// etcd-csr.json
{
    "CN": "etcd",
    "hosts": [
        "127.0.0.1",
        "192.168.5.100",
        "192.168.5.101",
        "192.168.5.102",
        "etcd0",
        "etcd1",
        "etcd2"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "CN",
            "ST": "shanghai",
            "L": "shanghai"
        }
    ]
}
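Since one certificate is shared by all members, its hosts list has to cover every address and name used to reach them. As a small sanity check, the Python sketch below loads a CSR document and reports any member address missing from hosts. The function name `missing_hosts` is hypothetical, and the embedded JSON is a shortened copy of the example above.

```python
import json
from typing import Dict, Iterable, List

CSR_TEXT = """
{
    "CN": "etcd",
    "hosts": [
        "127.0.0.1",
        "192.168.5.100",
        "192.168.5.101",
        "192.168.5.102",
        "etcd0",
        "etcd1",
        "etcd2"
    ],
    "key": {"algo": "ecdsa", "size": 256}
}
"""


def missing_hosts(csr: Dict, required: Iterable[str]) -> List[str]:
    """Return the required addresses that are absent from the CSR hosts list."""
    return sorted(set(required) - set(csr.get("hosts", [])))


csr = json.loads(CSR_TEXT)
required = ["127.0.0.1", "192.168.5.100", "192.168.5.101", "192.168.5.102"]
print(missing_hosts(csr, required))
```

Running the check again after adding a machine to the cluster shows whether the certificate needs to be reissued.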

Generate the certificate with the following command:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer etcd-csr.json | cfssljson -bare etcd

Once the certificates are ready, the following files should exist:

-rw-r--r-- 1 root root  832 Sep 27 16:38 ca-config.json
-rw-r--r-- 1 root root 1025 Sep 27 16:49 ca.csr
-rw-r--r-- 1 root root 313 Sep 27 16:39 ca-csr.json
-rw------- 1 root root 1675 Sep 27 16:49 ca-key.pem
-rw-r--r-- 1 root root 1354 Sep 27 16:49 ca.pem # required
-rw-r--r-- 1 root root 517 Sep 27 16:50 etcd.csr
-rw-r--r-- 1 root root 378 Sep 27 16:43 etcd-csr.json
-rw------- 1 root root 227 Sep 27 16:50 etcd-key.pem # required
-rw-r--r-- 1 root root 1172 Sep 27 16:50 etcd.pem # required

Distribute the certificates to every etcd member machine:

[root@localhost etcd-cluster]# scp ca.pem etcd.pem etcd-key.pem root@192.168.5.100:/root/etcd/cert
ca.pem 100% 1354 936.7KB/s 00:00
etcd.pem 100% 1172 876.9KB/s 00:00
etcd-key.pem 100% 227 182.2KB/s 00:00
[root@localhost etcd-cluster]# scp ca.pem etcd.pem etcd-key.pem root@192.168.5.101:/root/etcd/cert
ca.pem 100% 1354 1.0MB/s 00:00
etcd.pem 100% 1172 936.3KB/s 00:00
etcd-key.pem 100% 227 303.0KB/s 00:00
[root@localhost etcd-cluster]# scp ca.pem etcd.pem etcd-key.pem root@192.168.5.102:/root/etcd/cert
ca.pem 100% 1354 1.5MB/s 00:00
etcd.pem 100% 1172 1.5MB/s 00:00
etcd-key.pem 100% 227 264.7KB/s 00:00
[root@localhost etcd-cluster]#
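With more than a few members, typing one scp line per machine becomes tedious. The following Python sketch generates the same scp commands as above from a list of hosts. The function name `scp_commands` is hypothetical; the file names, user, and destination directory are the ones used in this note. It only prints the commands and does not execute them.

```python
from typing import List


def scp_commands(files: List[str], hosts: List[str],
                 dest_dir: str, user: str = "root") -> List[str]:
    """Build one scp command per host that copies all files to dest_dir."""
    sources = " ".join(files)
    return [f"scp {sources} {user}@{host}:{dest_dir}" for host in hosts]


cert_files = ["ca.pem", "etcd.pem", "etcd-key.pem"]
member_hosts = ["192.168.5.100", "192.168.5.101", "192.168.5.102"]

for command in scp_commands(cert_files, member_hosts, "/root/etcd/cert"):
    print(command)
```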

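Creating the TLS cluster

Take the plain-cluster creation command, append the following parameters at the end, and change the addresses from http to https.

In this note the certificates are stored in /root/etcd/cert.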

--client-cert-auth --trusted-ca-file=/root/etcd/cert/ca.pem \
--peer-client-cert-auth --peer-trusted-ca-file=/root/etcd/cert/ca.pem \
--cert-file=/root/etcd/cert/etcd.pem \
--key-file=/root/etcd/cert/etcd-key.pem \
--peer-cert-file=/root/etcd/cert/etcd.pem \
--peer-key-file=/root/etcd/cert/etcd-key.pem

Complete examples follow.

# etcd0
etcd --name etcd0 --initial-advertise-peer-urls https://192.168.5.100:2380 \
--listen-peer-urls https://192.168.5.100:2380 \
--listen-client-urls https://192.168.5.100:2379,https://127.0.0.1:2379 \
--advertise-client-urls https://192.168.5.100:2379 \
--initial-cluster-token etcd-tls-cluster-1 \
--initial-cluster etcd0=https://192.168.5.100:2380,etcd1=https://192.168.5.101:2380,etcd2=https://192.168.5.102:2380 \
--initial-cluster-state new \
--client-cert-auth --trusted-ca-file=/root/etcd/cert/ca.pem \
--peer-client-cert-auth --peer-trusted-ca-file=/root/etcd/cert/ca.pem \
--cert-file=/root/etcd/cert/etcd.pem \
--key-file=/root/etcd/cert/etcd-key.pem \
--peer-cert-file=/root/etcd/cert/etcd.pem \
--peer-key-file=/root/etcd/cert/etcd-key.pem

# etcd1
etcd --name etcd1 --initial-advertise-peer-urls https://192.168.5.101:2380 \
--listen-peer-urls https://192.168.5.101:2380 \
--listen-client-urls https://192.168.5.101:2379,https://127.0.0.1:2379 \
--advertise-client-urls https://192.168.5.101:2379 \
--initial-cluster-token etcd-tls-cluster-1 \
--initial-cluster etcd0=https://192.168.5.100:2380,etcd1=https://192.168.5.101:2380,etcd2=https://192.168.5.102:2380 \
--initial-cluster-state new \
--client-cert-auth --trusted-ca-file=/root/etcd/cert/ca.pem \
--peer-client-cert-auth --peer-trusted-ca-file=/root/etcd/cert/ca.pem \
--cert-file=/root/etcd/cert/etcd.pem \
--key-file=/root/etcd/cert/etcd-key.pem \
--peer-cert-file=/root/etcd/cert/etcd.pem \
--peer-key-file=/root/etcd/cert/etcd-key.pem
# etcd2
etcd --name etcd2 --initial-advertise-peer-urls https://192.168.5.102:2380 \
--listen-peer-urls https://192.168.5.102:2380 \
--listen-client-urls https://192.168.5.102:2379,https://127.0.0.1:2379 \
--advertise-client-urls https://192.168.5.102:2379 \
--initial-cluster-token etcd-tls-cluster-1 \
--initial-cluster etcd0=https://192.168.5.100:2380,etcd1=https://192.168.5.101:2380,etcd2=https://192.168.5.102:2380 \
--initial-cluster-state new \
--client-cert-auth --trusted-ca-file=/root/etcd/cert/ca.pem \
--peer-client-cert-auth --peer-trusted-ca-file=/root/etcd/cert/ca.pem \
--cert-file=/root/etcd/cert/etcd.pem \
--key-file=/root/etcd/cert/etcd-key.pem \
--peer-cert-file=/root/etcd/cert/etcd.pem \
--peer-key-file=/root/etcd/cert/etcd-key.pem
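The conversion described in this section, switching every address to https and appending the TLS parameters, is mechanical enough to sketch in Python. The function name `to_tls_command` is hypothetical, and the input is a shortened plain-cluster command used only for the demonstration. Note that this sketch does not change the cluster token, whereas the examples above also use a different token for the TLS cluster.

```python
def to_tls_command(plain_command: str, cert_dir: str = "/root/etcd/cert") -> str:
    """Switch every URL to https and append the TLS flags listed above."""
    secured = plain_command.strip().replace("http://", "https://")
    tls_flags = [
        f"--client-cert-auth --trusted-ca-file={cert_dir}/ca.pem",
        f"--peer-client-cert-auth --peer-trusted-ca-file={cert_dir}/ca.pem",
        f"--cert-file={cert_dir}/etcd.pem",
        f"--key-file={cert_dir}/etcd-key.pem",
        f"--peer-cert-file={cert_dir}/etcd.pem",
        f"--peer-key-file={cert_dir}/etcd-key.pem",
    ]
    # Join with backslash-newline so the result stays one shell command.
    return " \\\n".join([secured] + tls_flags)


plain = """etcd --name etcd0 --initial-advertise-peer-urls http://192.168.5.100:2380 \\
--listen-peer-urls http://192.168.5.100:2380 \\
--initial-cluster-state new"""

print(to_tls_command(plain))
```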

Verifying the status

# Set up a command alias
alias etcdctl='etcdctl --cacert=/root/etcd/cert/ca.pem --cert=/root/etcd/cert/etcd.pem --key=/root/etcd/cert/etcd-key.pem --endpoints=https://192.168.5.100:2379,https://192.168.5.101:2379,https://192.168.5.102:2379'

# Check node status
[root@etcd0 ~]# etcdctl endpoint status --write-out=table
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|          ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.5.100:2379 | f0e7b2174a652a4f |   3.5.0 |   25 kB |     false |      false |         5 |         15 |                 15 |        |
| https://192.168.5.101:2379 | 864cf652d87f0019 |   3.5.0 |   25 kB |      true |      false |         5 |         15 |                 15 |        |
| https://192.168.5.102:2379 | 7e13bf1b1bc93362 |   3.5.0 |   25 kB |     false |      false |         5 |         15 |                 15 |        |
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

# Check node health
[root@etcd0 ~]# etcdctl endpoint health --write-out=table
+----------------------------+--------+-------------+-------+
|          ENDPOINT          | HEALTH |    TOOK     | ERROR |
+----------------------------+--------+-------------+-------+
| https://192.168.5.102:2379 |   true |  8.271539ms |       |
| https://192.168.5.101:2379 |   true | 11.954757ms |       |
| https://192.168.5.100:2379 |   true | 14.052651ms |       |
+----------------------------+--------+-------------+-------+
[root@etcd0 ~]#

Command generation script

This script generates the etcd startup commands automatically.

# IP addresses of the cluster members
etcd_members = [
    "192.168.5.200",
    "192.168.5.201",
    "192.168.5.202",
    "192.168.5.203",
    "192.168.5.204",
]

# Member names, in the same order as the IPs above
etcd_members_name = [
    "k8s-master-1",
    "k8s-master-2",
    "k8s-master-3",
    "k8s-node-1",
    "k8s-node-2",
]

# Root folder that holds the etcd data directory.
# The data directory itself is named <member name>.etcd.
# Example: for member etcd-01, the setting below yields /root/etcd-01.etcd
data_dir_path = "/root/"

cert_info = {
    "ca": "/root/etcd/cert/ca.pem",
    "private": "/root/etcd/cert/etcd-key.pem",
    "public": "/root/etcd/cert/etcd.pem",
}


def generate_client_command_alias() -> str:
    """Build an etcdctl alias that targets every member over TLS."""
    endpoints = ",".join(f"https://{ip}:2379" for ip in etcd_members)
    alias_command = f"""
alias etcdctl='etcdctl --cacert={cert_info["ca"]} --cert={cert_info["public"]} --key={cert_info["private"]} --endpoints={endpoints}'
"""
    return alias_command


print(generate_client_command_alias())


def generate_initial_cluster() -> str:
    """Build the --initial-cluster value: comma-separated name=peer-URL pairs."""
    return ",".join(
        f"{host_name}=https://{ip}:2380"
        for host_name, ip in zip(etcd_members_name, etcd_members)
    )


initial_cluster = generate_initial_cluster()
# Set this to True to generate systemd unit files instead of bare commands
generate_systemd_content = True

for ip, host_name in zip(etcd_members, etcd_members_name):
    # Startup command for this member
    cmd = f"""etcd --name {host_name} --initial-advertise-peer-urls https://{ip}:2380 \\
--listen-peer-urls https://{ip}:2380 \\
--listen-client-urls https://{ip}:2379,https://127.0.0.1:2379 \\
--advertise-client-urls https://{ip}:2379 \\
--initial-cluster-token etcd-tls-cluster-1 \\
--initial-cluster {initial_cluster} \\
--initial-cluster-state new \\
--client-cert-auth --trusted-ca-file={cert_info["ca"]} \\
--peer-client-cert-auth --peer-trusted-ca-file={cert_info["ca"]} \\
--cert-file={cert_info["public"]} \\
--key-file={cert_info["private"]} \\
--peer-cert-file={cert_info["public"]} \\
--peer-key-file={cert_info["private"]} \\
--data-dir={data_dir_path}{host_name}.etcd"""

    if generate_systemd_content:
        # Wrap the command in a systemd unit file
        cmd = f"""[Unit]
Description=etcd service
After=rc-local.service nss-user-lookup.target
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
ExecStart=/bin/bash -c "/root/etcd/{cmd}"
ExecStop=
ExecReload=
Restart=on-failure
RestartSec=42s
[Install]
WantedBy=multi-user.target"""

    print(cmd, end="\n\n")


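Registering with systemd

An example is shown here; the script above can generate this content automatically.

Create etcd.service under /lib/systemd/system/.

Once the file is created, run the following commands to reload systemd and start the etcd service: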

systemctl daemon-reload
systemctl start etcd
[Unit]
Description=etcd service
After=rc-local.service nss-user-lookup.target
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
ExecStart=/bin/bash -c "/root/etcd/etcd --name k8s-master-1 --initial-advertise-peer-urls https://192.168.5.200:2380 \
--listen-peer-urls https://192.168.5.200:2380 \
--listen-client-urls https://192.168.5.200:2379,https://127.0.0.1:2379 \
--advertise-client-urls https://192.168.5.200:2379 \
--initial-cluster-token etcd-tls-cluster-1 \
--initial-cluster k8s-master-1=https://192.168.5.200:2380,k8s-master-2=https://192.168.5.201:2380,k8s-master-3=https://192.168.5.202:2380,k8s-node-1=https://192.168.5.203:2380,k8s-node-2=https://192.168.5.204:2380 \
--initial-cluster-state new \
--client-cert-auth --trusted-ca-file=/root/etcd/cert/ca.pem \
--peer-client-cert-auth --peer-trusted-ca-file=/root/etcd/cert/ca.pem \
--cert-file=/root/etcd/cert/etcd.pem \
--key-file=/root/etcd/cert/etcd-key.pem \
--peer-cert-file=/root/etcd/cert/etcd.pem \
--peer-key-file=/root/etcd/cert/etcd-key.pem \
--data-dir /root/k8s-master-1.etcd "
ExecStop=
ExecReload=
Restart=on-failure
RestartSec=42s
[Install]
WantedBy=multi-user.target
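Creating the unit file can also be scripted. The Python sketch below renders an abbreviated version of the unit above and writes it to etcd.service in a given directory. The names `UNIT_TEMPLATE` and `write_unit` are hypothetical, the template omits some of the lines shown above, and the ExecStart value is shortened for the demonstration. It writes into a temporary directory here; pointing it at /lib/systemd/system/ would require root privileges.

```python
import tempfile
from pathlib import Path

# Abbreviated version of the unit shown above (an assumption of this sketch).
UNIT_TEMPLATE = """[Unit]
Description=etcd service
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
ExecStart={exec_start}
Restart=on-failure
RestartSec=42s
[Install]
WantedBy=multi-user.target
"""


def write_unit(exec_start: str, directory: Path, name: str = "etcd.service") -> Path:
    """Render the unit template and write it to directory/name."""
    path = Path(directory) / name
    path.write_text(UNIT_TEMPLATE.format(exec_start=exec_start), encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    unit = write_unit(
        "/root/etcd/etcd --name k8s-master-1 --data-dir /root/k8s-master-1.etcd",
        Path(tmp),
    )
    print(unit.read_text(encoding="utf-8"))
```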