Background:

While upgrading the cluster from 1.15.4 to 1.16.8, I found that the kubernetes service in the default namespace became intermittently unreachable, with the error: Failed connect to 10.254.0.1:443; Connection refused

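As a result, some workloads that reach the apiserver through kubernetes.default could not communicate with it and failed to start. For example, the officially provided fluentd DaemonSet failed on startup with: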

2021-07-08 01:32:55 +0000 [error]: config error file="/etc/fluent/fluent.conf" error_class=Fluent::ConfigError error="start_namespace_watch: Exception encountered setting up namespace watch from Kubernetes API v 1 endpoint https://10.254.0.1:443/api: Failed to open TCP connection to 10.254.0.1:443 (Connection refused - connect(2) for \"10.254.0.1\" port 443)"

Repeated tests confirmed that the failure occurs at random.

Reproducing the error:

[root@centos-deployment-6c4fc8fcc-7l86j /]# curl -k https://10.254.0.1:443
curl: (7) Failed connect to 10.254.0.1:443; Connection refused

[root@centos-deployment-6c4fc8fcc-7l86j /]# curl -k https://10.254.0.1:443
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {

  },
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {

  },
  "code": 403
}

After some troubleshooting, the cause turned out to be that a service on one of the cluster's three master nodes had not been started.

Reflection:

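In principle, kubernetes.default is a cluster-level service that is registered in the default namespace automatically when the cluster starts. It is the address that pods normally use, with their default ServiceAccount, to talk to the apiserver. This built-in kubernetes.default service cannot be permanently deleted, and its ClusterIP is normally the first IP of the range given by --service-cluster-ip-range. The IPs and port in the kubernetes.default endpoints are set through the --advertise-address and --secure-port flags of each kube-apiserver.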
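To make the "first IP of the range" rule concrete, here is a minimal Go sketch that derives that address from a service CIDR. The range 10.254.0.0/16 is an assumption inferred from the ClusterIP 10.254.0.1 seen above (the actual flag value is not shown in this post), and firstServiceIP is a hypothetical helper, not a Kubernetes function.

```go
package main

import (
	"fmt"
	"net"
)

// firstServiceIP returns the network address of an IPv4 CIDR plus one,
// which is the address normally handed to the kubernetes service.
func firstServiceIP(cidr string) (string, error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return "", err
	}
	ip := ipnet.IP.To4()
	if ip == nil {
		return "", fmt.Errorf("only IPv4 is handled in this sketch: %s", cidr)
	}
	first := make(net.IP, len(ip))
	copy(first, ip)
	// Add one to the network address, carrying across octets if needed.
	for i := len(first) - 1; i >= 0; i-- {
		first[i]++
		if first[i] != 0 {
			break
		}
	}
	return first.String(), nil
}

func main() {
	// Assumed value of --service-cluster-ip-range for this cluster.
	ip, err := firstServiceIP("10.254.0.0/16")
	if err != nil {
		panic(err)
	}
	fmt.Println(ip) // 10.254.0.1
}
```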

So if one of the master nodes goes down, the corresponding backend (endpoint) of the kubernetes.default service should disappear as well. On this cluster, however, the result was:

[root@hz-k8s-master-199-152-51 ~]# kubectl get ep kubernetes
NAME         ENDPOINTS                                               AGE
kubernetes   10.199.152.51:443,10.199.152.52:443,10.199.152.53:443   630d

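Here 10.199.152.53 is the node I shut down manually this time, yet it is still listed: the cluster did not remove the stale endpoint, which is what caused the Connection refused errors above.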

Analysis:

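Other kubernetes clusters deployed from binaries did not show this behaviour: when an apiserver there crashes or is shut down, its endpoint disappears automatically. Comparing the apiserver configuration files narrowed the difference down to the flag --endpoint-reconciler-type=master-count. This flag accepts three values, master-count, lease and none, and defaults to lease.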

The value had been changed at the time because the official reference at https://kubernetes.io/zh/docs/reference/command-line-tools-reference/kube-apiserver/ says:

--apiserver-count int     Default: 1
The number of apiservers running in the cluster, must be a positive number. (In use when --endpoint-reconciler-type=master-count is enabled.)

My kubernetes cluster has three master nodes and sets --apiserver-count=3, so I had enabled --endpoint-reconciler-type=master-count.

Looking further at the source of 1.16 and of newer versions, this flag mainly selects a concrete implementation for EndpointReconcilerType:

/*
Copyright 2017 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

// Package reconcilers Endpoint Reconcilers for the apiserver
package reconcilers

import (
	"net"

	corev1 "k8s.io/api/core/v1"
)

// EndpointReconciler knows how to reconcile the endpoints for the apiserver service.
type EndpointReconciler interface {
	// ReconcileEndpoints sets the endpoints for the given apiserver service (ro or rw).
	// ReconcileEndpoints expects that the endpoints objects it manages will all be
	// managed only by ReconcileEndpoints; therefore, to understand this, you need only
	// understand the requirements.
	//
	// Requirements:
	//  * All apiservers MUST use the same ports for their {rw, ro} services.
	//  * All apiservers MUST use ReconcileEndpoints and only ReconcileEndpoints to manage the
	//      endpoints for their {rw, ro} services.
	//  * ReconcileEndpoints is called periodically from all apiservers.
	ReconcileEndpoints(serviceName string, ip net.IP, endpointPorts []corev1.EndpointPort, reconcilePorts bool) error
	// RemoveEndpoints removes this apiserver's lease.
	RemoveEndpoints(serviceName string, ip net.IP, endpointPorts []corev1.EndpointPort) error
	// StopReconciling turns any later ReconcileEndpoints call into a noop.
	StopReconciling()
}

// Type the reconciler type
type Type string

const (
	// MasterCountReconcilerType will select the original reconciler
	MasterCountReconcilerType Type = "master-count"
	// LeaseEndpointReconcilerType will select a storage based reconciler
	LeaseEndpointReconcilerType = "lease"
	// NoneEndpointReconcilerType will turn off the endpoint reconciler
	NoneEndpointReconcilerType = "none"
)

// Types an array of reconciler types
type Types []Type

// AllTypes export all reconcilers
var AllTypes = Types{
	MasterCountReconcilerType,
	LeaseEndpointReconcilerType,
	NoneEndpointReconcilerType,
}

// Names returns a slice of all the reconciler names
func (t Types) Names() []string {
	strs := make([]string, len(t))
	for i, v := range t {
		strs[i] = string(v)
	}
	return strs
}

PS: location of the reconcilers code in 1.16.15:

https://github.com/kubernetes/kubernetes/tree/v1.16.15/pkg/master/reconcilers

PS: location of the reconcilers code in 1.21.2:

https://github.com/kubernetes/kubernetes/tree/v1.21.2/pkg/controlplane/reconcilers

Looking at reconcilers.go, the master-count type is present in both 1.16 and the latest version, but the corresponding source snippet is no longer present in 1.21.2.

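My initial guess is that the master-count endpoint reconciler itself is problematic, and that it is not the case, as the official reference implies, that --endpoint-reconciler-type=master-count has to be set on kube-apiserver before the master-election behaviour of --apiserver-count= is enabled.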
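The stale endpoint is consistent with how a count-based reconciler works: each running apiserver adds its own IP and only trims the list when it holds more entries than the expected count, so nothing ever checks whether a listed IP is still alive. The Go sketch below is a simplified model of that idea, not the Kubernetes implementation; reconcileMasterCount and its trimming rule are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// reconcileMasterCount models one reconcile pass by the apiserver at selfIP.
// It ensures selfIP is listed and trims entries only when the list is longer
// than masterCount. Liveness of the other IPs is never checked.
func reconcileMasterCount(current []string, selfIP string, masterCount int) []string {
	set := map[string]bool{selfIP: true}
	for _, ip := range current {
		set[ip] = true
	}
	ips := make([]string, 0, len(set))
	for ip := range set {
		ips = append(ips, ip)
	}
	sort.Strings(ips)
	// Trim from the end while over the expected count, never dropping selfIP.
	for i := len(ips) - 1; i >= 0 && len(ips) > masterCount; i-- {
		if ips[i] != selfIP {
			ips = append(ips[:i], ips[i+1:]...)
		}
	}
	return ips
}

func main() {
	endpoints := []string{"10.199.152.51", "10.199.152.52", "10.199.152.53"}
	alive := []string{"10.199.152.51", "10.199.152.52"} // .53 has been shut down
	for round := 0; round < 3; round++ {
		for _, self := range alive {
			endpoints = reconcileMasterCount(endpoints, self, 3)
		}
	}
	// The dead master is still listed: the count never exceeds 3.
	fmt.Println(endpoints)
}
```

With --apiserver-count=3 and three registered IPs, the two surviving apiservers never see a reason to change the list, which matches the kubectl output shown earlier.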

Summary:

In a highly available cluster with multiple kube-apiserver masters, do not set --endpoint-reconciler-type=master-count; use --endpoint-reconciler-type=lease instead (lease is already the default).

With master-count, the kubernetes Endpoints are not updated after a kube-apiserver goes down, so access through the default kubernetes.default service is not highly available.

The flag should therefore be set to --endpoint-reconciler-type=lease, in which case the endpoint information is updated automatically when an apiserver goes down.