Kubernetes High-Availability Installation

2019-08-28


Build a highly available Kubernetes cluster with HAProxy, keepalived, and kubeadm.

Hostname          IP address     Components
K8s-master1-190   172.16.1.190   Kubernetes master, HAProxy, keepalived
K8s-master2-191   172.16.1.191   Kubernetes master, HAProxy, keepalived
K8s-master3-192   172.16.1.192   Kubernetes master
K8s-node1-194     172.16.1.194   Kubernetes node
Etcd1             172.16.1.193   etcd
Etcd2             172.16.1.195   etcd
Etcd3             172.16.1.198   etcd

1. Pre-installation initialization

1.1 Distribute SSH keys to all cluster nodes

Pick any machine to act as the Ansible control node and use it to distribute SSH keys in bulk.

 yum -y install ansible
 # On the first SSH connection you are normally asked to type "yes" so the host's fingerprint can be added to ~/.ssh/known_hosts; the control node's ~/.ssh/known_hosts does not contain these fingerprints yet, so disable host key checking in Ansible:
 sed -i "s@^#host_key_checking = False@host_key_checking = False@g" /etc/ansible/ansible.cfg

1.1.1 Write the Ansible playbook for key distribution

[root@k8s-master1-190 ansible]# cat /etc/ansible/ssh.yaml
- hosts: k8s
  tasks:
    - name: k8s ssh copy key
      authorized_key: user=root key="{{lookup('file', '/root/.ssh/id_rsa.pub')}}"

    - name: yum install expect
      yum: name=expect state=latest

    - name: mkdir /scripts
      file: path=/scripts state=directory

    - name: copy scripts
      copy: src=/scripts/ssh-copy.sh dest=/scripts/ssh-copy.sh mode=0744

    - name: bash /scripts/ssh-copy.sh
      shell: /bin/bash /scripts/ssh-copy.sh

1.1.2 Write the hosts file

[root@k8s-master1-190 ansible]# cat /etc/ansible/hosts
[k8s]
172.16.1.190
172.16.1.191
172.16.1.192
172.16.1.193
172.16.1.194
172.16.1.195
172.16.1.196
172.16.1.197
172.16.1.198
172.16.1.199



[k8s:vars]
ansible_ssh_user=root
ansible_ssh_pass=123456789

1.1.3 Generate the SSH key pair

[root@k8s-master1-190 ansible]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:o1i8H3QhX0bkMo/VeAY5XyCBh8AovmPbBpPTbMbwrQc root@k8s-master1-190
The key's randomart image is:
+---[RSA 2048]----+
|       o.. +*o.. |
|    . . . oo+=  .|
|   . .  . +.*o+. |
|    o.   o X o.  |
|     Oo.S + .    |
|    OoE+.o       |
|   ..Xoo.        |
|    . +...       |
|     . ..        |
+----[SHA256]-----+

1.1.4 Create the bulk key distribution script

[root@k8s-master1-190 ansible]# cat /scripts/ssh-copy.sh
#!/usr/bin/bash
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
ssh-keygen -b 1024 -f /root/.ssh/id_rsa -t rsa -P ""
for i in `echo 190 191 192 193 194 195 196 197 198 199`
do
IP=172.16.1.$i
/usr/bin/expect <<EOF
    spawn ssh-copy-id -i /root/.ssh/id_rsa.pub root@$IP
    expect {
        "(yes/no)? " {send "yes\r";exp_continue}
        "password: " {send "123456789\r";}
    }
    expect eof
EOF
done

1.1.5 Run the Ansible playbook

ansible-playbook /etc/ansible/ssh.yaml
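To verify that password-less SSH now works from the control node to every host, a quick check with Ansible's built-in ping module should return "pong" for each machine:

ansible k8s -m ping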

1.2 Performance tuning

cat >> /etc/sysctl.conf<<EOF
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
net.ipv4.neigh.default.gc_thresh1=4096
net.ipv4.neigh.default.gc_thresh2=6144
net.ipv4.neigh.default.gc_thresh3=8192
EOF
sysctl -p
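If sysctl -p complains that net.bridge.bridge-nf-call-iptables is an unknown key, the br_netfilter kernel module is most likely not loaded yet. A minimal sketch to load it now and on every boot (the file name under /etc/modules-load.d is arbitrary):

modprobe br_netfilter
echo "br_netfilter" > /etc/modules-load.d/br_netfilter.conf
sysctl -p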

2. Install the etcd cluster

Etcd is a highly available key-value store released by CoreOS. In Kubernetes it is used mainly for service discovery and storing cluster state, and etcd itself can be deployed as a cluster for its own high availability.
There are three main ways for etcd to bootstrap its own HA cluster:

Static discovery: the etcd cluster members are known in advance, and each node's addresses are specified directly at startup
Etcd dynamic discovery: an existing etcd cluster acts as the coordination point, and new members discover each other through it when a new cluster is brought up
DNS dynamic discovery: member addresses are obtained through DNS (SRV record) queries

2.1 Build the etcd cluster with static discovery

2.1.1 Environment preparation

Install etcd on the following three machines (note: an etcd cluster should have an odd number of members).

Node / Hostname   IP address
Etcd1             172.16.1.193
Etcd2             172.16.1.195
Etcd3             172.16.1.198

2.1.2 Install etcd

CentOS provides an official etcd RPM that can be installed directly with yum. The newest version currently available from yum is 3.3.11; upstream has already released 3.3.12, but for stability this guide uses 3.3.11.

# yum -y install etcd
# etcd --version
etcd Version: 3.3.11
Git SHA: 2cf9e51
Go Version: go1.10.3
Go OS/Arch: linux/amd64

2.1.3 Edit the configuration file

The default configuration file:

[root@Etcd2 /]# cat /etc/etcd/etcd.conf
#[Member]
#ETCD_CORS=""
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
#ETCD_WAL_DIR=""
#ETCD_LISTEN_PEER_URLS="http://localhost:2380"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379"
#ETCD_MAX_SNAPSHOTS="5"
#ETCD_MAX_WALS="5"
ETCD_NAME="default"
#ETCD_SNAPSHOT_COUNT="100000"
#ETCD_HEARTBEAT_INTERVAL="100"
#ETCD_ELECTION_TIMEOUT="1000"
#ETCD_QUOTA_BACKEND_BYTES="0"
#ETCD_MAX_REQUEST_BYTES="1572864"
#ETCD_GRPC_KEEPALIVE_MIN_TIME="5s"
#ETCD_GRPC_KEEPALIVE_INTERVAL="2h0m0s"
#ETCD_GRPC_KEEPALIVE_TIMEOUT="20s"
#
#[Clustering]
#ETCD_INITIAL_ADVERTISE_PEER_URLS="http://localhost:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379"
#ETCD_DISCOVERY=""
#ETCD_DISCOVERY_FALLBACK="proxy"
#ETCD_DISCOVERY_PROXY=""
#ETCD_DISCOVERY_SRV=""
#ETCD_INITIAL_CLUSTER="default=http://localhost:2380"
#ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
#ETCD_INITIAL_CLUSTER_STATE="new"
#ETCD_STRICT_RECONFIG_CHECK="true"
#ETCD_ENABLE_V2="true"
#
#[Proxy]
#ETCD_PROXY="off"
#ETCD_PROXY_FAILURE_WAIT="5000"
#ETCD_PROXY_REFRESH_INTERVAL="30000"
#ETCD_PROXY_DIAL_TIMEOUT="1000"
#ETCD_PROXY_WRITE_TIMEOUT="5000"
#ETCD_PROXY_READ_TIMEOUT="0"
#
#[Security]
#ETCD_CERT_FILE=""
#ETCD_KEY_FILE=""
#ETCD_CLIENT_CERT_AUTH="false"
#ETCD_TRUSTED_CA_FILE=""
#ETCD_AUTO_TLS="false"
#ETCD_PEER_CERT_FILE=""
#ETCD_PEER_KEY_FILE=""
#ETCD_PEER_CLIENT_CERT_AUTH="false"
#ETCD_PEER_TRUSTED_CA_FILE=""
#ETCD_PEER_AUTO_TLS="false"
#
#[Logging]
#ETCD_DEBUG="false"
#ETCD_LOG_PACKAGE_LEVELS=""
#ETCD_LOG_OUTPUT="default"
#
#[Unsafe]
#ETCD_FORCE_NEW_CLUSTER="false"
#
#[Version]
#ETCD_VERSION="false"
#ETCD_AUTO_COMPACTION_RETENTION="0"
#
#[Profiling]
#ETCD_ENABLE_PPROF="false"
#ETCD_METRICS="basic"
#
#[Auth]
#ETCD_AUTH_TOKEN="simple"

The modified configuration file; adjust the other two nodes accordingly:

[root@Etcd1 /]# grep -v "^#" /etc/etcd/etcd.conf
# Directory where etcd stores its data
ETCD_DATA_DIR="/var/lib/etcd/etcd1"
# URLs this member listens on for peer (member-to-member) traffic
ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
# URLs this member listens on for client traffic; clients connect here to talk to etcd
ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
# Name of this node
ETCD_NAME="etcd1"
# Peer URLs this member advertises to the rest of the cluster
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.16.1.193:2380"
# Client URLs this member advertises to the rest of the cluster
ETCD_ADVERTISE_CLIENT_URLS="http://172.16.1.193:2379"
# All members of the cluster; etcd1 is the node's --name, and 172.16.1.193:2380 is its --initial-advertise-peer-urls value
ETCD_INITIAL_CLUSTER="etcd1=http://172.16.1.193:2380,etcd2=http://172.16.1.195:2380,etcd3=http://172.16.1.198:2380"
# Token used when creating the cluster; keep it unique per cluster. If you rebuild the cluster, even with identical configuration, a new cluster and new member UUIDs are generated; reusing a token across clusters can cause conflicts and unpredictable errors
ETCD_INITIAL_CLUSTER_TOKEN="zsf-etcd-cluster"
# "new" when creating a new cluster; "existing" when joining an existing one
ETCD_INITIAL_CLUSTER_STATE="new"

2.1.4 Start the etcd cluster and check its status

Start etcd on each node:

systemctl start etcd.service

Once the cluster is up, run etcdctl member list on any node to list all cluster members:

[root@Etcd1 /]# etcdctl member list
729a9a39e059871b: name=etcd2 peerURLs=http://172.16.1.195:2380 clientURLs=http://172.16.1.195:2379 isLeader=false
ce1ac55777b620f9: name=etcd1 peerURLs=http://172.16.1.193:2380 clientURLs=http://172.16.1.193:2379 isLeader=true
e62232af7400cdbe: name=etcd3 peerURLs=http://172.16.1.198:2380 clientURLs=http://172.16.1.198:2379 isLeader=false

Use etcdctl cluster-health to check the health of the cluster:

[root@Etcd1 /]# etcdctl cluster-health
member 729a9a39e059871b is healthy: got healthy result from http://172.16.1.195:2379
member ce1ac55777b620f9 is healthy: got healthy result from http://172.16.1.193:2379
member e62232af7400cdbe is healthy: got healthy result from http://172.16.1.198:2379
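As an optional sanity check, a simple write/read/delete round trip can be run through the v2 API that etcdctl 3.3 uses by default (the key name here is arbitrary):

etcdctl set /sanity-check ok
etcdctl get /sanity-check
etcdctl rm /sanity-check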

3. Install HAProxy and keepalived

3.1 Deploy keepalived

3.1.1 Install keepalived

Install keepalived on master1:

[root@k8s-master1-190 /]# yum -y install epel-re*
[root@k8s-master1-190 /]# yum -y install keepalived.x86_64
[root@k8s-master1-190 /]# cat > /etc/keepalived/keepalived.conf <<-'EOF'
! Configuration File for keepalived
global_defs {
   router_id k8s-master1-190
}
vrrp_instance VI_1 {
    state MASTER
    interface ens192
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass zsf
    }
    virtual_ipaddress {
        172.16.1.89
    }
}
EOF
systemctl start keepalived.service && systemctl enable keepalived.service

Install keepalived on master2:

[root@k8s-master2-191 /]# yum -y install epel-re*
[root@k8s-master2-191 /]# yum -y install keepalived.x86_64
[root@k8s-master2-191 /]# cat > /etc/keepalived/keepalived.conf <<-'EOF'
! Configuration File for keepalived
global_defs {
   router_id k8s-master2-191
}
vrrp_instance VI_1 {
    state MASTER
    interface ens192
    virtual_router_id 51
    priority 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass zsf
    }
    virtual_ipaddress {
        172.16.1.89
    }
}
EOF
systemctl start keepalived.service && systemctl enable keepalived.service
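As configured above, keepalived only tracks the node itself, not HAProxy, so the VIP will not move if HAProxy dies while the node stays up. An optional refinement, not part of the original setup, is to add a health-check script (the name chk_haproxy is arbitrary) to /etc/keepalived/keepalived.conf on both nodes and reference it from the vrrp_instance block; the weight must be large enough to drop the master's priority (100) below the backup's (50):

vrrp_script chk_haproxy {
    ! exits non-zero when haproxy is not active
    script "/usr/bin/systemctl is-active haproxy"
    interval 2
    ! 100 - 60 = 40 < 50, so the peer takes over
    weight -60
}

! and inside the existing vrrp_instance VI_1 block:
    track_script {
        chk_haproxy
    }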

Check that the VIP was created successfully.
Look at the IP addresses on master1 (the keepalived master):

[root@k8s-master1-190 /]# ip a | grep "172.16.1.89"
    inet 172.16.1.89/32 scope global ens192
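A quick optional failover test: stop keepalived on master1, confirm the VIP shows up on master2, then start keepalived again so the VIP returns to the higher-priority node:

systemctl stop keepalived.service     # on master1
ip a | grep "172.16.1.89"             # on master2: the VIP should now appear here
systemctl start keepalived.service    # on master1: the VIP moves back (priority 100 > 50)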

3.2 Deploy HAProxy

Install HAProxy on master1:

[root@k8s-master1-190 /]# yum -y install haproxy.x86_64
[root@k8s-master1-190 /]# cat > /etc/haproxy/haproxy.cfg <<-'EOF'
global
        chroot  /var/lib/haproxy
        daemon
        group haproxy
        user haproxy
        log 127.0.0.1:514 local0 warning
        pidfile /var/lib/haproxy.pid
        maxconn 20000
        spread-checks 3
        nbproc 8
defaults
        log     global
        mode    tcp
        retries 3
        option redispatch
listen https-apiserver
        bind 172.16.1.89:8443
        mode tcp
        balance roundrobin
        timeout server 15s
        timeout connect 15s
        server apiserver01 172.16.1.190:6443 check port 6443 inter 5000 fall 5
        server apiserver02 172.16.1.191:6443 check port 6443 inter 5000 fall 5
        server apiserver03 172.16.1.192:6443 check port 6443 inter 5000 fall 5
EOF
[root@k8s-master1-190 /]# systemctl start  haproxy.service && systemctl enable haproxy.service

Install HAProxy on master2:

[root@k8s-master2-191 /]# yum -y install haproxy
[root@k8s-master2-191 /]# cat > /etc/haproxy/haproxy.cfg <<-'EOF'
global
        chroot  /var/lib/haproxy
        daemon
        group haproxy
        user haproxy
        log 127.0.0.1:514 local0 warning
        pidfile /var/lib/haproxy.pid
        maxconn 20000
        spread-checks 3
        nbproc 8
defaults
        log     global
        mode    tcp
        retries 3
        option redispatch
listen https-apiserver
        bind 172.16.1.89:8443
        mode tcp
        balance roundrobin
        timeout server 15s
        timeout connect 15s
        server apiserver01 172.16.1.190:6443 check port 6443 inter 5000 fall 5
        server apiserver02 172.16.1.191:6443 check port 6443 inter 5000 fall 5
        server apiserver03 172.16.1.192:6443 check port 6443 inter 5000 fall 5
EOF
[root@k8s-master2-191 /]# systemctl start  haproxy.service && systemctl enable haproxy.service
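Note that HAProxy binds to the VIP 172.16.1.89, which only exists on whichever node currently holds it; on the other node the bind will fail unless the kernel is allowed to bind non-local addresses (an alternative is simply to bind 0.0.0.0:8443). A small sketch of the sysctl approach, followed by a check that port 8443 is listening:

cat >> /etc/sysctl.conf <<EOF
net.ipv4.ip_nonlocal_bind=1
EOF
sysctl -p
systemctl restart haproxy.service
ss -lntp | grep 8443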

4. Install and deploy Kubernetes 1.15.0

4.1 Configure the Kubernetes yum repository

Run this on all three masters:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

4.2 Install Docker

Run this on all nodes:

yum install -y yum-utils device-mapper-persistent-data lvm2
wget -O /etc/yum.repos.d/docker-ce.repo https://download.docker.com/linux/centos/docker-ce.repo
yum makecache fast
yum -y install docker-ce
systemctl enable docker && systemctl start docker

4.3 Install the Kubernetes components

Run this on all three masters:

yum -y install kubectl-1.15.0 kubelet-1.15.0  kubeadm-1.15.0
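The upstream kubeadm installation docs also enable kubelet at this point so that kubeadm can manage it; that step is assumed here as well:

systemctl enable kubelet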

4.4 Configure the cgroup driver used by kubelet (set on all hosts after installation)

echo 'Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"' >> /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf

4.5 Allow privileged pods for the kubelet service (set on all hosts after installation); otherwise deploying Heapster later will fail

echo 'Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --fail-swap-on=false"' >> /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
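systemd only picks up changes to the 10-kubeadm.conf drop-in after a reload, so after the two edits above run:

systemctl daemon-reload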

4.6 Configure a Docker registry mirror (all hosts)

cat > /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://tj7mo5wf.mirror.aliyuncs.com"]
}
EOF
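Docker has to be restarted before daemon.json takes effect:

systemctl restart docker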

4.7 Create the kubeadm configuration file

[root@k8s-master1-190 ~]# cat > kubeadm-config.yaml <<-'EOF'
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.15.0
controlPlaneEndpoint: 172.16.1.89:8443
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
networking:
  podSubnet: 10.10.0.0/16
etcd:
    external:
        endpoints:
        - http://172.16.1.193:2379
        - http://172.16.1.195:2379
        - http://172.16.1.198:2379
EOF

4.8 Pull the required images

[root@k8s-master1-190 ~]# kubeadm config images pull --config kubeadm-config.yaml
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.15.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.15.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.15.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.15.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.3.1

4.9 Run the deployment command

[root@k8s-master1-190 ~]# kubeadm init --config=kubeadm-config.yaml --experimental-upload-certs

After the installation succeeds you will see output like the following:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

# Copy the admin kubeconfig; kubectl can only operate the cluster after this step
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:
# Join additional master (control-plane) nodes to the cluster
  kubeadm join 172.16.1.89:8443 --token zvbtcb.8p2akd4drz5sofog \
    --discovery-token-ca-cert-hash sha256:e2813c1d67fe6d9471cdc881cee5c3e0264557089d1b281b79749fba50d8badc \
    --experimental-control-plane --certificate-key 69acad3da9ddacb9734aa5607448e2fbc49627bc4073386fb04d785f9317b923

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

# Join worker nodes to the cluster; the token is valid for 24 hours and has to be regenerated once it expires
kubeadm join 172.16.1.89:8443 --token zvbtcb.8p2akd4drz5sofog \
    --discovery-token-ca-cert-hash sha256:e2813c1d67fe6d9471cdc881cee5c3e0264557089d1b281b79749fba50d8badc

4.9.1 Set up the kubeconfig for cluster access

Following the prompt above, create the kubeconfig in the home directory of the user who will run kubectl:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

4.9.2 View the current cluster nodes

[root@k8s-master1-190 ~]# kubectl get nodes -o wide
NAME              STATUS     ROLES    AGE   VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
k8s-master1-190   NotReady   master   80s   v1.15.0   172.16.1.190   <none>        CentOS Linux 7 (Core)   3.10.0-862.el7.x86_64   docker://17.3.2

There is one node at this point, and its status is "NotReady".

4.9.3 View the currently running pods

[root@k8s-master1-190 ~]#  kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE   IP             NODE              NOMINATED NODE   READINESS GATES
kube-system   coredns-6967fb4995-dxxjp                  0/1     Pending   0          77s   <none>         <none>            <none>           <none>
kube-system   coredns-6967fb4995-hh62r                  0/1     Pending   0          77s   <none>         <none>            <none>           <none>
kube-system   kube-apiserver-k8s-master1-190            1/1     Running   0          26s   172.16.1.190   k8s-master1-190   <none>           <none>
kube-system   kube-controller-manager-k8s-master1-190   1/1     Running   0          33s   172.16.1.190   k8s-master1-190   <none>           <none>
kube-system   kube-proxy-gjjgx                          1/1     Running   0          77s   172.16.1.190   k8s-master1-190   <none>           <none>
kube-system   kube-scheduler-k8s-master1-190            1/1     Running   0          31s   172.16.1.190   k8s-master1-190   <none>           <none>

Because no network plugin is installed yet, the CoreDNS pods are stuck in Pending.

4.9.4 Install the flannel network plugin

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
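One caveat: the stock kube-flannel.yml configures flannel's Network as 10.244.0.0/16, while the kubeadm configuration above sets podSubnet to 10.10.0.0/16, and the two should match. One way to reconcile them (a sketch; alternatively set podSubnet to 10.244.0.0/16 in kubeadm-config.yaml before init) is to download the manifest and edit its net-conf.json before applying:

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
sed -i 's@10.244.0.0/16@10.10.0.0/16@g' kube-flannel.yml
kubectl apply -f kube-flannel.yml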

4.9.5 Join new master nodes to the cluster

The other two machines join the cluster as masters.

Copy the certificates to the newly added master nodes:

scp -r /etc/kubernetes/pki/ root@172.16.1.191:/etc/kubernetes/
scp -r /etc/kubernetes/pki/ root@172.16.1.192:/etc/kubernetes/
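Then run the control-plane join command printed by kubeadm init on each of the two new masters; the token, discovery hash, and certificate key below are the ones from the output in section 4.9:

kubeadm join 172.16.1.89:8443 --token zvbtcb.8p2akd4drz5sofog \
    --discovery-token-ca-cert-hash sha256:e2813c1d67fe6d9471cdc881cee5c3e0264557089d1b281b79749fba50d8badc \
    --experimental-control-plane --certificate-key 69acad3da9ddacb9734aa5607448e2fbc49627bc4073386fb04d785f9317b923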

4.9.6 Join worker nodes to the cluster

kubeadm join 172.16.1.89:8443 --token zvbtcb.8p2akd4drz5sofog \
    --discovery-token-ca-cert-hash sha256:e2813c1d67fe6d9471cdc881cee5c3e0264557089d1b281b79749fba50d8badc
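If the token has expired (it is only valid for 24 hours), a fresh join command can be generated on any master, and the result verified with kubectl:

kubeadm token create --print-join-command
kubectl get nodes -o wide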


Title: Kubernetes High-Availability Installation
Author: shoufuzhang
URL: https://www.zhangshoufu.com/articles/2019/08/28/1567004212871.html
Quote: The master has failed more times than the beginner has tried.