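Installation notes for etcd, a distributed key-value store

Installing and starting etcd turned out to differ in a few places from what the official documentation describes, so these notes record the steps to avoid hitting the same pitfalls again.

GitHub repository: https://github.com/coreos/etcd

The notes cover starting etcd as a cluster. Assume two machines:

Machine 1: 192.168.33.10
Machine 2: 192.168.33.11

Note: the two machines do not need SSH trust (passwordless SSH) between them.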

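1. Download

Download the latest version from the GitHub release page: Releases · coreos/etcd · GitHub

These notes use the latest version as of July 1, 2018 (v3.3.8), Linux amd64: https://github.com/coreos/etc...

Extract the archive and enter the directory: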

tar -zvxf etcd-v3.3.8-linux-amd64.tar.gz

cd etcd-v3.3.8-linux-amd64

Note: repeat these steps on every machine in the cluster.
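
Since the extraction has to be repeated on every machine, it can help to derive the archive and directory names from a single version variable. The following is only a minimal sketch: the variable names are illustrative, and it assumes the archive has already been downloaded into the current directory.

```shell
#!/bin/sh
# Sketch: derive the archive and directory names from one version variable.
# Variable names are illustrative; v3.3.8 is the version used in these notes.
ETCD_VERSION="v3.3.8"
ETCD_DIR="etcd-${ETCD_VERSION}-linux-amd64"
ETCD_TARBALL="${ETCD_DIR}.tar.gz"

# Extract only if the archive is present in the current directory.
if [ -f "$ETCD_TARBALL" ]; then
    tar -zxvf "$ETCD_TARBALL"
    # Sanity check: both binaries used later should exist and be executable.
    if [ -x "$ETCD_DIR/etcd" ] && [ -x "$ETCD_DIR/etcdctl" ]; then
        echo "extracted to $ETCD_DIR"
    fi
else
    echo "archive $ETCD_TARBALL not found; download it from the release page first"
fi
```

Changing ETCD_VERSION is then the only edit needed when repeating the steps with a different release.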

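2. Start

Official documentation: Install etcd | Get Started with etcd | CoreOS

According to the official documentation, etcd can be started either with command-line flags or with a configuration file. Both approaches are described below.

2.1 Starting with command-line flags

Note: the commands below use nohup together with a trailing & so that etcd keeps running in the background; >/dev/null 2>&1 discards all of its output.

2.1.1 Start on 192.168.33.10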

nohup ./etcd --name my-etcd-1 \
--listen-client-urls http://192.168.33.10:2379 \
--advertise-client-urls http://192.168.33.10:2379 \
--listen-peer-urls http://192.168.33.10:2380 \
--initial-advertise-peer-urls http://192.168.33.10:2380 \
--initial-cluster my-etcd-1=http://192.168.33.10:2380,my-etcd-2=http://192.168.33.11:2380 \
--initial-cluster-token my-etcd-token \
--initial-cluster-state new \
>/dev/null 2>&1 &

2.1.2 Start on 192.168.33.11

nohup ./etcd --name my-etcd-2 \
--listen-client-urls http://192.168.33.11:2379 \
--advertise-client-urls http://192.168.33.11:2379 \
--listen-peer-urls http://192.168.33.11:2380 \
--initial-advertise-peer-urls http://192.168.33.11:2380 \
--initial-cluster my-etcd-1=http://192.168.33.10:2380,my-etcd-2=http://192.168.33.11:2380 \
--initial-cluster-token my-etcd-token \
--initial-cluster-state new \
>/dev/null 2>&1 &
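
The two commands above differ only in the node name and the IP address, so the flags can be generated from those two values. The following is a minimal sketch rather than part of the original procedure: the helper names initial_cluster and etcd_flags and the MEMBERS variable are hypothetical, while the ports, names, and token are the ones used above. It only prints the command so that it can be reviewed before being run.

```shell
#!/bin/sh
# Sketch: derive the per-node etcd flags from a node name and an IP.
# MEMBERS, initial_cluster and etcd_flags are illustrative names, not part of etcd.

# Cluster members as "name=ip" pairs, matching the two machines above.
MEMBERS="my-etcd-1=192.168.33.10 my-etcd-2=192.168.33.11"

# Build the value for --initial-cluster: name1=http://ip1:2380,name2=http://ip2:2380
initial_cluster() {
    out=""
    for m in $MEMBERS; do
        name=${m%%=*}   # text before the first "="
        ip=${m#*=}      # text after the first "="
        out="${out:+$out,}${name}=http://${ip}:2380"
    done
    printf '%s\n' "$out"
}

# Print the full flag list for one node: $1 = node name, $2 = node IP.
etcd_flags() {
    echo "--name $1" \
        "--listen-client-urls http://$2:2379" \
        "--advertise-client-urls http://$2:2379" \
        "--listen-peer-urls http://$2:2380" \
        "--initial-advertise-peer-urls http://$2:2380" \
        "--initial-cluster $(initial_cluster)" \
        "--initial-cluster-token my-etcd-token" \
        "--initial-cluster-state new"
}

# Print the command for machine one; review it before running it.
echo "./etcd $(etcd_flags my-etcd-1 192.168.33.10)"
```

On the second machine the last line would be called with my-etcd-2 and 192.168.33.11 instead.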

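2.2 Starting with a configuration file

Configuration file references: Install etcd | Get Started with etcd | CoreOS; etcd/etcd.conf.yml.sample at master · coreos/etcd · GitHub

Name the configuration file etcd.conf.yml and place it inside the etcd-v3.3.8-linux-amd64 directory, next to the etcd binary, because the commands below refer to it as ./etcd.conf.yml.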

2.2.1 Add the configuration file on 192.168.33.10 and start

vim ./etcd.conf.yml

With the following content:

name:                        my-etcd-1
listen-client-urls:          http://192.168.33.10:2379
advertise-client-urls:       http://192.168.33.10:2379
listen-peer-urls:            http://192.168.33.10:2380
initial-advertise-peer-urls: http://192.168.33.10:2380
initial-cluster:             my-etcd-1=http://192.168.33.10:2380,my-etcd-2=http://192.168.33.11:2380
initial-cluster-token:       my-etcd-token
initial-cluster-state:       new

Then run:

nohup ./etcd --config-file ./etcd.conf.yml >/dev/null 2>&1 &

2.2.2 Add the configuration file on 192.168.33.11 and start

vim ./etcd.conf.yml

With the following content:

name:                        my-etcd-2
listen-client-urls:          http://192.168.33.11:2379
advertise-client-urls:       http://192.168.33.11:2379
listen-peer-urls:            http://192.168.33.11:2380
initial-advertise-peer-urls: http://192.168.33.11:2380
initial-cluster:             my-etcd-1=http://192.168.33.10:2380,my-etcd-2=http://192.168.33.11:2380
initial-cluster-token:       my-etcd-token
initial-cluster-state:       new

Then run:

nohup ./etcd --config-file ./etcd.conf.yml >/dev/null 2>&1 &
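
Because the two configuration files differ only in the node name and the IP address, they can also be written by a small script instead of being edited by hand. This is a sketch under assumptions: write_etcd_conf is a hypothetical helper, and the example writes into a scratch directory created with mktemp rather than into the etcd directory.

```shell
#!/bin/sh
# Sketch: write etcd.conf.yml for one node.
# write_etcd_conf is an illustrative helper, not something shipped with etcd.
INITIAL_CLUSTER="my-etcd-1=http://192.168.33.10:2380,my-etcd-2=http://192.168.33.11:2380"

# $1 = node name, $2 = node IP, $3 = output path
write_etcd_conf() {
    cat > "$3" <<EOF
name: $1
listen-client-urls: http://$2:2379
advertise-client-urls: http://$2:2379
listen-peer-urls: http://$2:2380
initial-advertise-peer-urls: http://$2:2380
initial-cluster: $INITIAL_CLUSTER
initial-cluster-token: my-etcd-token
initial-cluster-state: new
EOF
}

# Example: generate the file for machine one in a scratch directory and show it.
conf_dir=$(mktemp -d)
write_etcd_conf my-etcd-1 192.168.33.10 "$conf_dir/etcd.conf.yml"
cat "$conf_dir/etcd.conf.yml"
```

To use the generated file, pass its path to ./etcd --config-file as in the commands above.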

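3. Verify the cluster

According to the official documentation, the state of the cluster can be checked from any node in the cluster by running:

# still inside the etcd-v3.3.8-linux-amd64 directory

./etcdctl cluster-health

However, running this command produces the following output: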

[vagrant@192-168-33-10 etcd-v3.3.8-linux-amd64]$ ./etcdctl cluster-health
cluster may be unhealthy: failed to list members
Error:  client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
; error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused

This error is puzzling at first, because the addresses given at startup were not 127.0.0.1.

After some searching, the cause turned up in a GitHub issue:

I faced the same situation. After searching, I find etcdctl cmd use default endpoints "http://127.0.0.1:2379,http://127.0.0.1:4001". If etcd start with --listen-client-urls http://HOST_IP:2379, then you can use etcdctl like etcdctl --endpoints 'http://HOST_IP:2379' member list

The issue offers several workarounds. In my view the cleanest one is to pass the --endpoints flag:

./etcdctl --endpoints http://192.168.33.10:2379,http://192.168.33.11:2379 cluster-health

The --endpoints value may also list just one of the machines, for example:

./etcdctl --endpoints http://192.168.33.10:2379 cluster-health

For details, see: Etcd2 cluster working but etcdctl broken · Issue #1028 · coreos/bugs · GitHub
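
To avoid retyping the endpoint list, the --endpoints value can be built from the machine IPs. The following is a minimal sketch: build_endpoints is a hypothetical helper, and port 2379 is the client port used above. The final line rests on an assumption that should be checked against your etcdctl version, namely that etcdctl also reads the ETCDCTL_ENDPOINTS environment variable when the flag is not given.

```shell
#!/bin/sh
# Sketch: build the --endpoints value from a list of IPs.
# build_endpoints is an illustrative helper, not an etcdctl feature.
build_endpoints() {
    eps=""
    for ip in "$@"; do
        eps="${eps:+$eps,}http://${ip}:2379"
    done
    printf '%s\n' "$eps"
}

ENDPOINTS=$(build_endpoints 192.168.33.10 192.168.33.11)

# Print the resulting command; run it from the etcd directory.
echo "./etcdctl --endpoints $ENDPOINTS cluster-health"

# Assumption: etcdctl falls back to ETCDCTL_ENDPOINTS when --endpoints is
# not passed. Verify this with your etcdctl version before relying on it.
export ETCDCTL_ENDPOINTS="$ENDPOINTS"
```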

With the endpoint specified, the output shows that the etcd cluster has started successfully:

[vagrant@192-168-33-10 etcd-v3.3.8-linux-amd64]$ ./etcdctl --endpoints http://192.168.33.10:2379 cluster-health
member 42ab269b4f75b118 is healthy: got healthy result from http://192.168.33.11:2379
member 7118e8ab00eced36 is healthy: got healthy result from http://192.168.33.10:2379
cluster is healthy
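
For scripted checks, the cluster-health output can be inspected instead of read by eye. The sketch below is illustrative only: is_cluster_healthy and healthy_members are hypothetical helpers that read the output on standard input, and the sample text is copied from the run above so that the example does not need a live cluster.

```shell
#!/bin/sh
# Sketch: evaluate cluster-health output read from stdin.
# is_cluster_healthy and healthy_members are illustrative helper names.

# Succeeds only if the summary line "cluster is healthy" is present.
is_cluster_healthy() {
    grep -q '^cluster is healthy$'
}

# Prints the number of members reported as healthy.
healthy_members() {
    grep -c '^member .* is healthy'
}

# Sample output copied from the run above, used instead of a live cluster.
SAMPLE='member 42ab269b4f75b118 is healthy: got healthy result from http://192.168.33.11:2379
member 7118e8ab00eced36 is healthy: got healthy result from http://192.168.33.10:2379
cluster is healthy'

printf '%s\n' "$SAMPLE" | healthy_members
if printf '%s\n' "$SAMPLE" | is_cluster_healthy; then
    echo "OK"
fi
```

Against a real cluster, the output of ./etcdctl --endpoints http://192.168.33.10:2379 cluster-health would be piped into these helpers in place of the sample text.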

The member list command shows the cluster members as well:

[vagrant@192-168-33-11 etcd-v3.3.8-linux-amd64]$ ./etcdctl --endpoints http://192.168.33.10:2379 member list
42ab269b4f75b118: name=my-etcd-2 peerURLs=http://192.168.33.11:2380 clientURLs=http://192.168.33.11:2379 isLeader=true
7118e8ab00eced36: name=my-etcd-1 peerURLs=http://192.168.33.10:2380 clientURLs=http://192.168.33.10:2379 isLeader=false
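
The isLeader field in this output makes it easy to find the current leader from a script. The following is a sketch that assumes the output format shown above: leader_name is a hypothetical helper that reads member list output on standard input, and the sample text is copied from the run above.

```shell
#!/bin/sh
# Sketch: print the name of the leader from `member list` output on stdin.
# leader_name is an illustrative helper; it assumes the format shown above.
leader_name() {
    awk '/isLeader=true/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^name=/) {
                sub(/^name=/, "", $i)   # strip the "name=" prefix
                print $i
            }
        }
    }'
}

# Sample output copied from the run above, used instead of a live cluster.
SAMPLE='42ab269b4f75b118: name=my-etcd-2 peerURLs=http://192.168.33.11:2380 clientURLs=http://192.168.33.11:2379 isLeader=true
7118e8ab00eced36: name=my-etcd-1 peerURLs=http://192.168.33.10:2380 clientURLs=http://192.168.33.10:2379 isLeader=false'

printf '%s\n' "$SAMPLE" | leader_name
```

For the sample above this prints my-etcd-2, the member whose line carries isLeader=true.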

References

  1. Linux redirection and how to suppress nohup output - CSDN blog
  2. etcd/etcd.conf.yml.sample at master · coreos/etcd · GitHub
  3. Etcd2 cluster working but etcdctl broken · Issue #1028 · coreos/bugs · GitHub
