Today I set up a Kubernetes cluster with Calico as the CNI, but the calico-node pods never became ready:
    ubuntu@perf-test-0:~/yiwei/performance_test$ kbctl get pod -A
    NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE
    kube-system   calico-kube-controllers-676c4cbdf-zl6gr   1/1     Running   0          26s
    kube-system   calico-node-bznhp                         0/1     Running   0          28s
    kube-system   calico-node-stwcs                         0/1     Running   0          28s
Inspecting the pod's events showed that no BGP connection had been established between the two nodes:
    ubuntu@perf-test-0:~/yiwei/performance_test$ kbctl describe pod calico-node-bznhp -n kube-system
    =================================
      Normal   Pulled     88s   kubelet   Container image "calico/node:v3.16.5" already present on machine
      Normal   Created    88s   kubelet   Created container calico-node
      Normal   Started    87s   kubelet   Started container calico-node
      Warning  Unhealthy  84s   kubelet   Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/calico/bird.ctl: connect: connection refused
      Warning  Unhealthy  74s   kubelet   Readiness probe failed: 2020-11-15 08:24:34.851 [INFO][192] confd/health.go 180: Number of node(s) with BGP peering established = 0
    calico/node is not ready: BIRD is not ready: BGP not established with 192.168.0.1
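The last line of that output, combined with the node information for the cluster, points to a likely cause: the Calico pod on this node is trying to peer with 192.168.0.1, but no node in the cluster has that IP!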
    ubuntu@perf-test-0:~/yiwei/performance_test$ kbctl get node -o wide
    NAME          STATUS   ROLES    AGE     VERSION    INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
    perf-test-0   Ready    master   17m     v1.18.10   10.117.9.232   <none>        Ubuntu 18.04.5 LTS   4.15.0-122-generic   docker://19.3.13
    perf-test-1   Ready    <none>   7m54s   v1.18.10   10.117.9.238   <none>        Ubuntu 18.04.5 LTS   4.15.0-122-generic   docker://19.3.13
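The comparison above (is the BGP peer address actually one of the nodes?) can be sketched in Python. This is only an illustration: the function names `node_internal_ips` and `peer_is_known_node` are made up for this post, and the parser assumes that none of the columns before INTERNAL-IP contain spaces, which holds for the output shown here.

```python
import ipaddress


def node_internal_ips(wide_output: str) -> dict:
    """Map node name -> INTERNAL-IP from `kubectl get node -o wide` text.

    Assumption: columns up to INTERNAL-IP contain no embedded spaces,
    so a plain whitespace split lines up with the header.
    """
    lines = [ln for ln in wide_output.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    name_col = header.index("NAME")
    ip_col = header.index("INTERNAL-IP")
    ips = {}
    for row in lines[1:]:
        fields = row.split()
        ips[fields[name_col]] = ipaddress.ip_address(fields[ip_col])
    return ips


def peer_is_known_node(peer: str, ips: dict) -> bool:
    """Return True if the BGP peer address belongs to one of the nodes."""
    return ipaddress.ip_address(peer) in ips.values()


if __name__ == "__main__":
    # Node table copied from the output above.
    sample = """\
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
perf-test-0 Ready master 17m v1.18.10 10.117.9.232 <none> Ubuntu 18.04.5 LTS 4.15.0-122-generic docker://19.3.13
perf-test-1 Ready <none> 7m54s v1.18.10 10.117.9.238 <none> Ubuntu 18.04.5 LTS 4.15.0-122-generic docker://19.3.13
"""
    ips = node_internal_ips(sample)
    print(peer_is_known_node("192.168.0.1", ips))  # False: no node has this IP
```

A `False` result here matches what the readiness probe was hinting at: the peer address does not belong to any node.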
Searching for this error led me to this GitHub issue. Following the steps below resolved the problem:
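1. As shown in the figure below, Calico requires that nodes can reach each other over certain ports and protocols; this is also stated in the official Calico documentation. So on every node, open port 179 (BGP) in the Ubuntu firewall with the following command:

       sudo ufw allow 179/tcp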
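2. Because BGP was peering with the wrong IP, open the YAML manifest used to deploy Calico and add the IP_AUTODETECTION_METHOD environment variable yourself, which sets the network interface Calico uses to detect the node IP. It can be set according to ...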
       containers:
         # Runs calico-node container on each Kubernetes node. This
         # container programs network policy and routes on each
         # host.
         - name: calico-node
           image: calico/node:v3.16.5
           envFrom:
           - configMapRef:
               # Allow KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT to be overridden for eBPF mode.
               name: kubernetes-services-endpoint
               optional: true
           env:
             - name: IP_AUTODETECTION_METHOD
               value: interface=ens192
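One way to find the interface name to put after `interface=` is to look up which interface carries the node's INTERNAL-IP. The sketch below parses the one-line-per-address output of `ip -o -4 addr show`. The helper name `interface_for_ip` is hypothetical, and the sample text is made up for illustration: only ens192 and 10.117.9.232 come from this post, while the /24 prefix and the `br-example` interface are assumptions, not something observed on these nodes.

```python
import ipaddress
from typing import Optional


def interface_for_ip(ip_addr_output: str, node_ip: str) -> Optional[str]:
    """Return the interface that carries node_ip, or None if not found.

    Expects text in the format printed by `ip -o -4 addr show`, where
    each line looks like: "<index>: <ifname> inet <addr>/<prefix> ...".
    """
    target = ipaddress.ip_address(node_ip)
    for line in ip_addr_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "inet":
            continue  # not an IPv4 address line
        if ipaddress.ip_interface(fields[3]).ip == target:
            return fields[1]
    return None


if __name__ == "__main__":
    # Illustrative sample only; the prefix lengths and br-example are made up.
    sample = r"""
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
2: ens192    inet 10.117.9.232/24 brd 10.117.9.255 scope global ens192\       valid_lft forever preferred_lft forever
3: br-example    inet 192.168.0.1/24 brd 192.168.0.255 scope global br-example\       valid_lft forever preferred_lft forever
"""
    print(interface_for_ip(sample, "10.117.9.232"))  # ens192
```

On a real node, the input would be the text captured from running `ip -o -4 addr show` there, and the returned name is what goes into the `interface=` value.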
Finally, I reset the cluster and re-applied the YAML above, and Calico deployed successfully.
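As a follow-up check, a small Python sketch can confirm that TCP port 179 on the other node is reachable once the firewall rule is in place. The helper name `tcp_port_open` is made up for this post. A successful connection only shows that something is listening on the port and that the firewall lets the traffic through; it does not prove that a BGP session has been established.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Run on perf-test-0 to probe the BGP port on perf-test-1
    # (IP taken from the node listing earlier in this post).
    print(tcp_port_open("10.117.9.238", 179))
```

Running it from each node against the other covers both directions, since the firewall rule has to be present on every node.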