Kubernetes Rolling Upgrade
Background
Kubernetes is an excellent cluster management tool for containerized applications. With objects such as ReplicaSet, which automatically handle an application's lifecycle events, it brings container application management to a fine art. Among its many features, the one that best demonstrates Kubernetes' power as a cluster application manager is the rolling upgrade.
The essence of a rolling upgrade is that the service remains continuously available throughout the upgrade, so the process is invisible to the outside world. The rollout passes through three states: all old instances, a mix of old and new instances, and all new instances. The number of old instances gradually decreases while the number of new instances gradually increases, until the old count reaches 0 and the new count reaches the desired target.
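This shrink/grow transition can be sketched as a toy simulation. `simulateRollout` is an invented name, not part of Kubernetes, and it assumes a surge of one extra Pod during the transition so capacity never drops below the target:

```go
package main

import "fmt"

// simulateRollout steps a rollout one replica at a time: a new Pod is
// started before an old one is removed, so at least `desired` Pods exist
// throughout. It returns the sequence of (old, new) replica counts,
// covering all three states: all old, mixed, all new.
func simulateRollout(desired int) [][2]int {
	oldN, newN := desired, 0
	states := [][2]int{{oldN, newN}}
	for oldN > 0 {
		newN++ // scale the new set up by one
		oldN-- // then scale the old set down by one
		states = append(states, [2]int{oldN, newN})
	}
	return states
}

func main() {
	for _, s := range simulateRollout(3) {
		fmt.Printf("old=%d new=%d\n", s[0], s[1])
	}
}
```

The first state printed is all old instances, the last is all new; every state in between is the mixed phase the article describes.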
Rolling Upgrades in Kubernetes
Kubernetes uses the ReplicaSet (RS for short) to manage Pod instances. If the number of Pods currently in the cluster is below the target, the RS starts new Pods; otherwise, it deletes surplus Pods according to its policy. A Deployment exploits exactly this behavior: by controlling the Pods of two ReplicaSets, it implements the upgrade.
A rolling upgrade is a smooth, gradual transition during which the service stays available. This is a key step for Kubernetes as a platform that manages applications as services. Services are everywhere and consumed on demand; that is the original intent of cloud computing. For a PaaS platform, abstracting applications into services that span the cluster and are available anytime, anywhere is its ultimate mission.
1. ReplicaSet
The concept of a ReplicaSet should be familiar by now, so let's look at the RS in the k8s source code.
type ReplicaSetController struct {
	kubeClient clientset.Interface
	podControl controller.PodControlInterface

	// internalPodInformer is used to hold a personal informer. If we're using
	// a normal shared informer, then the informer will be started for us. If
	// we have a personal informer, we must start it ourselves. If you start
	// the controller using NewReplicationManager(passing SharedInformer), this
	// will be null
	internalPodInformer framework.SharedIndexInformer

	// A ReplicaSet is temporarily suspended after creating/deleting these many replicas.
	// It resumes normal action after observing the watch events for them.
	burstReplicas int
	// To allow injection of syncReplicaSet for testing.
	syncHandler func(rsKey string) error

	// A TTLCache of pod creates/deletes each rc expects to see.
	expectations *controller.UIDTrackingControllerExpectations

	// A store of ReplicaSets, populated by the rsController
	rsStore cache.StoreToReplicaSetLister
	// Watches changes to all ReplicaSets
	rsController *framework.Controller
	// A store of pods, populated by the podController
	podStore cache.StoreToPodLister
	// Watches changes to all pods
	podController framework.ControllerInterface
	// podStoreSynced returns true if the pod store has been synced at least once.
	// Added as a member to the struct to allow injection for testing.
	podStoreSynced func() bool

	lookupCache *controller.MatchingCache

	// Controllers that need to be synced
	queue *workqueue.Type

	// garbageCollectorEnabled denotes if the garbage collector is enabled. RC
	// manager behaves differently if GC is enabled.
	garbageCollectorEnabled bool
}
This struct lives in pkg/controller/replicaset. Here we can see the most important members of the RS controller. One is podControl, the object that operates on Pods; as the name suggests, it controls the lifecycle of the Pods owned by the RS. Let's look at the methods that PodControlInterface provides.
// PodControlInterface is an interface that knows how to add or delete pods
// created as an interface to allow testing.
type PodControlInterface interface {
	// CreatePods creates new pods according to the spec.
	CreatePods(namespace string, template *api.PodTemplateSpec, object runtime.Object) error
	// CreatePodsOnNode creates a new pod according to the spec on the specified node.
	CreatePodsOnNode(nodeName, namespace string, template *api.PodTemplateSpec, object runtime.Object) error
	// CreatePodsWithControllerRef creates new pods according to the spec, and sets object as the pod's controller.
	CreatePodsWithControllerRef(namespace string, template *api.PodTemplateSpec, object runtime.Object, controllerRef *api.OwnerReference) error
	// DeletePod deletes the pod identified by podID.
	DeletePod(namespace string, podID string, object runtime.Object) error
	// PatchPod patches the pod.
	PatchPod(namespace, name string, data []byte) error
}
As we can see, the RS has full control over its Pods. There are two watches here, rsController and podController, which watch etcd for changes to ReplicaSets and Pods respectively. One important member deserves mention: syncHandler, which every controller has. Each controller watches etcd for changes and uses a sync function to reconcile the state of its objects. Note that this handler is only a delegate; the actual handler is assigned when the controller is created. This pattern is not specific to the RS controller; the other controllers follow it as well.
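The delegation pattern can be sketched in a few lines. The names `controller` and `newController` below are invented for illustration; the real constructor assigns rsc.syncHandler = rsc.syncReplicaSet in the same spirit:

```go
package main

import "fmt"

// controller mimics the pattern used by ReplicaSetController: syncHandler
// is a func field rather than a method, so tests (or a constructor) can
// swap in any implementation without touching the rest of the controller.
type controller struct {
	syncHandler func(key string) error
}

// newController wires in the real handler, the way the Kubernetes
// constructors assign their sync functions at creation time.
func newController() *controller {
	c := &controller{}
	c.syncHandler = c.syncReplicaSet
	return c
}

// syncReplicaSet stands in for the real reconcile logic.
func (c *controller) syncReplicaSet(key string) error {
	fmt.Println("syncing", key)
	return nil
}

func main() {
	c := newController()
	c.syncHandler("default/my-rs")

	// In a test, the delegate can be replaced wholesale:
	c.syncHandler = func(key string) error {
		fmt.Println("fake sync for", key)
		return nil
	}
	c.syncHandler("default/my-rs")
}
```

Because the worker loop only ever calls `c.syncHandler`, injecting a fake handler is enough to unit-test the queueing machinery in isolation.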
The following code makes the watch logic clearer.
rsc.rsStore.Store, rsc.rsController = framework.NewInformer(
	&cache.ListWatch{
		ListFunc: func(options api.ListOptions) (runtime.Object, error) {
			return rsc.kubeClient.Extensions().ReplicaSets(api.NamespaceAll).List(options)
		},
		WatchFunc: func(options api.ListOptions) (watch.Interface, error) {
			return rsc.kubeClient.Extensions().ReplicaSets(api.NamespaceAll).Watch(options)
		},
	},
	&extensions.ReplicaSet{},
	// TODO: Can we have much longer period here?
	FullControllerResyncPeriod,
	framework.ResourceEventHandlerFuncs{
		AddFunc:    rsc.enqueueReplicaSet,
		UpdateFunc: rsc.updateRS,
		// This will enter the sync loop and no-op, because the replica set has been deleted from the store.
		// Note that deleting a replica set immediately after scaling it to 0 will not work. The recommended
		// way of achieving this is by performing a `stop` operation on the replica set.
		DeleteFunc: rsc.enqueueReplicaSet,
	},
)
Each time a change to an object is observed in etcd, an appropriate action is taken; concretely, the object's key is enqueued, updated, or dequeued. Pods are handled similarly.
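The watch/enqueue/sync flow can be sketched as follows. `newQueue`, `enqueue`, and `runWorker` are invented names (the real controller uses the workqueue package); the point is the pattern, where event handlers do no real work themselves:

```go
package main

import "fmt"

// newQueue returns a buffered key queue; the real code uses workqueue.
func newQueue() chan string { return make(chan string, 10) }

// enqueue is essentially what AddFunc/UpdateFunc/DeleteFunc boil down to:
// put the object's key on the queue, nothing more.
func enqueue(q chan string, key string) { q <- key }

// runWorker drains n keys from the queue, invoking sync for each one.
// sync stands in for the controller's syncHandler, which reconciles the
// object's full state from the local store.
func runWorker(q chan string, n int, sync func(string)) {
	for i := 0; i < n; i++ {
		sync(<-q)
	}
}

func main() {
	q := newQueue()
	enqueue(q, "default/frontend") // add event
	enqueue(q, "default/frontend") // a status update re-enqueues the same key
	runWorker(q, 2, func(key string) { fmt.Println("syncing", key) })
}
```

Decoupling event delivery from reconciliation this way means a burst of watch events costs only cheap enqueues, and the expensive sync runs at the worker's own pace.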
podInformer.AddEventHandler(framework.ResourceEventHandlerFuncs{
	AddFunc: rsc.addPod,
	// This invokes the ReplicaSet for every pod change, eg: host assignment. Though this might seem like
	// overkill the most frequent pod update is status, and the associated ReplicaSet will only list from
	// local storage, so it should be ok.
	UpdateFunc: rsc.updatePod,
	DeleteFunc: rsc.deletePod,
})
That covers the basics of the ReplicaSet. Above the RS sits the Deployment, which is also a controller.
// DeploymentController is responsible for synchronizing Deployment objects stored
// in the system with actual running replica sets and pods.
type DeploymentController struct {
	client        clientset.Interface
	eventRecorder record.EventRecorder

	// To allow injection of syncDeployment for testing.
	syncHandler func(dKey string) error

	// A store of deployments, populated by the dController
	dStore cache.StoreToDeploymentLister
	// Watches changes to all deployments
	dController *framework.Controller
	// A store of ReplicaSets, populated by the rsController
	rsStore cache.StoreToReplicaSetLister
	// Watches changes to all ReplicaSets
	rsController *framework.Controller
	// A store of pods, populated by the podController
	podStore cache.StoreToPodLister
	// Watches changes to all pods
	podController *framework.Controller

	// dStoreSynced returns true if the Deployment store has been synced at least once.
	// Added as a member to the struct to allow injection for testing.
	dStoreSynced func() bool
	// rsStoreSynced returns true if the ReplicaSet store has been synced at least once.
	// Added as a member to the struct to allow injection for testing.
	rsStoreSynced func() bool
	// podStoreSynced returns true if the pod store has been synced at least once.
	// Added as a member to the struct to allow injection for testing.
	podStoreSynced func() bool

	// Deployments that need to be synced
	queue workqueue.RateLimitingInterface
}
A DeploymentController needs to watch Deployments, ReplicaSets, and Pods, as its construction shows.
dc.dStore.Store, dc.dController = framework.NewInformer(
	&cache.ListWatch{
		ListFunc: func(options api.ListOptions) (runtime.Object, error) {
			return dc.client.Extensions().Deployments(api.NamespaceAll).List(options)
		},
		WatchFunc: func(options api.ListOptions) (watch.Interface, error) {
			return dc.client.Extensions().Deployments(api.NamespaceAll).Watch(options)
		},
	},
	&extensions.Deployment{},
	FullDeploymentResyncPeriod,
	framework.ResourceEventHandlerFuncs{
		AddFunc:    dc.addDeploymentNotification,
		UpdateFunc: dc.updateDeploymentNotification,
		// This will enter the sync loop and no-op, because the deployment has been deleted from the store.
		DeleteFunc: dc.deleteDeploymentNotification,
	},
)
dc.rsStore.Store, dc.rsController = framework.NewInformer(
	&cache.ListWatch{
		ListFunc: func(options api.ListOptions) (runtime.Object, error) {
			return dc.client.Extensions().ReplicaSets(api.NamespaceAll).List(options)
		},
		WatchFunc: func(options api.ListOptions) (watch.Interface, error) {
			return dc.client.Extensions().ReplicaSets(api.NamespaceAll).Watch(options)
		},
	},
	&extensions.ReplicaSet{},
	resyncPeriod(),
	framework.ResourceEventHandlerFuncs{
		AddFunc:    dc.addReplicaSet,
		UpdateFunc: dc.updateReplicaSet,
		DeleteFunc: dc.deleteReplicaSet,
	},
)
dc.podStore.Indexer, dc.podController = framework.NewIndexerInformer(
	&cache.ListWatch{
		ListFunc: func(options api.ListOptions) (runtime.Object, error) {
			return dc.client.Core().Pods(api.NamespaceAll).List(options)
		},
		WatchFunc: func(options api.ListOptions) (watch.Interface, error) {
			return dc.client.Core().Pods(api.NamespaceAll).Watch(options)
		},
	},
	&api.Pod{},
	resyncPeriod(),
	framework.ResourceEventHandlerFuncs{
		AddFunc:    dc.addPod,
		UpdateFunc: dc.updatePod,
		DeleteFunc: dc.deletePod,
	},
	cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc},
)
dc.syncHandler = dc.syncDeployment
dc.dStoreSynced = dc.dController.HasSynced
dc.rsStoreSynced = dc.rsController.HasSynced
dc.podStoreSynced = dc.podController.HasSynced
The core of it all is syncDeployment, which contains the implementations of rollingUpdate and rollback. If a watched Deployment object has a non-nil RollbackTo.Revision, a rollback is performed. This Revision is a version number; note that even though this is a rollback, the revision number recorded internally by Kubernetes always increases.
You may wonder how rollback is done. The principle is simple: Kubernetes keeps the PodTemplate of every revision, and rolling back just means overwriting the current template with the old one.
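A toy sketch of this idea, with all names invented for illustration (in the real implementation the old templates survive in the old ReplicaSets, each annotated with its revision):

```go
package main

import "fmt"

// deployment keeps a template per revision, mimicking how Kubernetes can
// recover any previous PodTemplate when asked to roll back.
type deployment struct {
	revision int
	template string
	history  map[int]string // revision -> PodTemplate
}

// update records a new template under the next revision number.
func (d *deployment) update(tmpl string) {
	d.revision++
	d.template = tmpl
	d.history[d.revision] = tmpl
}

// rollbackTo overwrites the current template with the one recorded for an
// older revision. Note this is itself an update: the revision number keeps
// growing even though the template content goes backwards.
func (d *deployment) rollbackTo(rev int) {
	d.update(d.history[rev])
}

func main() {
	d := &deployment{history: map[int]string{}}
	d.update("nginx:1.9")  // revision 1
	d.update("nginx:1.10") // revision 2
	d.rollbackTo(1)        // revision 3, template from revision 1
	fmt.Println(d.revision, d.template) // prints: 3 nginx:1.9
}
```

This also explains the observation above: after a rollback the Deployment is at a higher revision than before, just with an older template.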
Kubernetes supports two upgrade strategies: recreate and rolling update.
switch d.Spec.Strategy.Type {
case extensions.RecreateDeploymentStrategyType:
	return dc.rolloutRecreate(d)
case extensions.RollingUpdateDeploymentStrategyType:
	return dc.rolloutRolling(d)
}
rolloutRolling holds all the secrets, as we can see here.
func (dc *DeploymentController) rolloutRolling(deployment *extensions.Deployment) error {
	newRS, oldRSs, err := dc.getAllReplicaSetsAndSyncRevision(deployment, true)
	if err != nil {
		return err
	}
	allRSs := append(oldRSs, newRS)

	// Scale up, if we can.
	scaledUp, err := dc.reconcileNewReplicaSet(allRSs, newRS, deployment)
	if err != nil {
		return err
	}
	if scaledUp {
		// Update DeploymentStatus
		return dc.updateDeploymentStatus(allRSs, newRS, deployment)
	}

	// Scale down, if we can.
	scaledDown, err := dc.reconcileOldReplicaSets(allRSs, controller.FilterActiveReplicaSets(oldRSs), newRS, deployment)
	if err != nil {
		return err
	}
	if scaledDown {
		// Update DeploymentStatus
		return dc.updateDeploymentStatus(allRSs, newRS, deployment)
	}

	dc.cleanupDeployment(oldRSs, deployment)

	// Sync deployment status
	return dc.syncDeploymentStatus(allRSs, newRS, deployment)
}
It does the following:
1. Find the new RS and the old RSs, and compute the new revision (the maximum of the existing revisions);
2. Scale up the new RS, if possible;
3. Scale down the old RSs, if possible;
4. When done, delete the old RSs;
5. Sync the Deployment status to etcd;
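The steps above can be condensed into a toy reconcile function. The name `reconcile` and its simplified semantics are inventions for this sketch: maxSurge is fixed at 1 and maxUnavailable at 0, and each call makes at most one unit of progress, just as each pass of rolloutRolling either scales up or scales down and then returns:

```go
package main

import "fmt"

// reconcile performs one step of the rollout: scale the new RS up if there
// is room under the surge limit, otherwise scale the old RSs down if the
// total exceeds the desired count. It returns the updated counts and
// whether any progress was made; repeated syncs drive it to completion.
func reconcile(oldReplicas, newReplicas, desired int) (int, int, bool) {
	total := oldReplicas + newReplicas
	if newReplicas < desired && total < desired+1 { // room under maxSurge=1
		return oldReplicas, newReplicas + 1, true // scale up the new RS
	}
	if oldReplicas > 0 && total > desired { // honor maxUnavailable=0
		return oldReplicas - 1, newReplicas, true // scale down an old RS
	}
	return oldReplicas, newReplicas, false // rollout complete
}

func main() {
	oldN, newN := 3, 0
	for {
		o, n, progressed := reconcile(oldN, newN, 3)
		if !progressed {
			break
		}
		oldN, newN = o, n
		fmt.Printf("old=%d new=%d\n", oldN, newN)
	}
}
```

Because each sync only nudges the counts, the controller stays correct even if it crashes mid-rollout: the next sync picks up from the current state, which is the essence of the reconcile loop.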
And that is how rolling upgrades work in Kubernetes. Traditional load-balanced applications perform rolling upgrades in a very similar way, but in a container environment we have ReplicaSets, which make the approach much more convenient.