ETCD node failover

Dannyvon

2019-06-30

Unreachable member

Assume a cluster of etcd containers has been created successfully.
Check the cluster status with the following command.

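# etcdctl --endpoints <endpoint> cluster-health

<endpoint> is the client URL of a reachable member, for example https://10.23.2.108:3379.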

If the cluster is running normally, the output looks like:

member xxx is healthy: got healthy result from https://10.23.2.109:3379
member xxx is healthy: got healthy result from https://10.23.2.108:3379
member xxx is healthy: got healthy result from https://10.23.2.110:3379
cluster is healthy

If one member has failed, the output may look like:

failed to check the health of member xxx on https://10.23.2.109:3379: Get https://10.23.2.109:3379/health: dial tcp 10.23.2.109:3379: connect: connection refused
member xxx is unreachable: [https://10.23.2.109:3379] are all unreachable
member xxx is healthy: got healthy result from https://10.23.2.108:3379
member xxx is healthy: got healthy result from https://10.23.2.110:3379
cluster is healthy
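
When the health check runs from a script, the IDs of the unreachable members can be pulled out of the output and handed to later steps. The following is a minimal sketch, assuming the line format shown above. The function name unreachable_members and the member IDs in the sample are made up for illustration.

```shell
#!/bin/sh
# Print the ID of every member that cluster-health reports as unreachable.
# Reads the command output on stdin and matches lines of the form
# "member <id> is unreachable: ...".
unreachable_members() {
    awk '$1 == "member" && $3 == "is" && $4 ~ /^unreachable/ { print $2 }'
}

# Captured output with hypothetical member IDs.
sample='failed to check the health of member 8e9e05c52164694d on https://10.23.2.109:3379: Get https://10.23.2.109:3379/health: dial tcp 10.23.2.109:3379: connect: connection refused
member 8e9e05c52164694d is unreachable: [https://10.23.2.109:3379] are all unreachable
member 91bc3c398fb3c146 is healthy: got healthy result from https://10.23.2.108:3379
cluster is healthy'

printf '%s\n' "$sample" | unreachable_members

# Against a live cluster (ENDPOINT is an assumed variable holding a client URL):
#   etcdctl --endpoints "$ENDPOINT" cluster-health | unreachable_members
```

Each printed ID can then be used as the xxx argument of the member remove command in Case 1.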

The failure usually falls into one of the following four cases.

Case 1: The whole environment of an etcd container was destroyed.

Solution

Remove the destroyed member with etcdctl.

# etcdctl member remove xxx

xxx is the member ID of the unreachable member.

Create a new etcd container, adding the following environment variables to env in its config file.

"ETCD_INITIAL_CLUSTER_STATE": "existing"
"ETCD_INITIAL_CLUSTER": <the peer URLs of the cluster, including the new etcd container>

For example, "hostname2=https://10.23.2.108:3380,hostname3=https://10.23.2.110:3380" are the peer URLs of the cluster after the destroyed member is removed; ETCD_INITIAL_CLUSTER must contain these plus the entry for the new container.

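Add the new container to the existing cluster. The etcd runtime reconfiguration guide performs this step before the new member's process is started, so if the new container fails to join, run this command first and then start the container.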

# etcdctl --endpoints <endpoint> member add <name> <peerURL>

<endpoint> is the client URL of a healthy member.

<name> is the hostname in the new container's config file.

<peerURL> is one of the URLs listed in ETCD_INITIAL_ADVERTISE_PEER_URLS in its config file.
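
The ETCD_INITIAL_CLUSTER value used in Case 1 is a comma-separated list of name=peerURL pairs. A minimal sketch of assembling it is given here. The helper name initial_cluster is hypothetical, and the hostname1 entry assumes the replacement container reuses the destroyed member's address with the same peer port as the other members.

```shell
#!/bin/sh
# Join "name=peerURL" arguments with commas to form an
# ETCD_INITIAL_CLUSTER value.
initial_cluster() {
    # "$*" joins the arguments with the first character of IFS.
    # The subshell keeps the IFS change from leaking to the caller.
    ( IFS=,; printf '%s\n' "$*" )
}

# The two surviving members from the example above, plus an assumed
# entry for the new container.
initial_cluster \
    "hostname2=https://10.23.2.108:3380" \
    "hostname3=https://10.23.2.110:3380" \
    "hostname1=https://10.23.2.109:3380"
```

The printed string is what goes into "ETCD_INITIAL_CLUSTER" in the new container's config file.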

Case 2: The etcd container doesn't exist.

Solution

Add "ETCD_INITIAL_CLUSTER_STATE": "existing" to the container creation config file.
Create the container with the new config file, keeping all other settings the same as before.

Case 3: The etcd container was stopped.

Solution

Start the container.

# docker start <container>

Case 4: The etcd service was stopped in its container.

Solution

Restart the container whose etcd service has stopped.

# docker restart <container>
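
Cases 2 to 4 can be told apart by the container's state as reported by Docker. The sketch below maps that state to the matching case for a member that cluster-health has already reported as unreachable. The function name suggest_fix is hypothetical, and the container state alone cannot distinguish Case 1 from Case 2, since that depends on whether the member's data survived.

```shell
#!/bin/sh
# Map a Docker container state to the matching recovery case.
# $1 is the state string; an empty string means the container was not found.
suggest_fix() {
    case "$1" in
        "")             echo "case 2: recreate the container with ETCD_INITIAL_CLUSTER_STATE=existing" ;;
        exited|created) echo "case 3: docker start" ;;
        running)        echo "case 4: docker restart" ;;
        *)              echo "unhandled container state: $1" ;;
    esac
}

# Examples with literal states.
suggest_fix exited
suggest_fix running

# With a real container (CONTAINER is an assumed variable holding its name):
#   state=$(docker inspect --format '{{.State.Status}}' "$CONTAINER" 2>/dev/null)
#   suggest_fix "$state"
```

States such as paused or restarting are left unhandled on purpose, because the cases above do not cover them.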

Unhealthy member

If a member is unhealthy, remove its container together with its metadata, then create a new container as described in Case 2 above.
