Kubernetes pod won't start: CrashLoopBackOff

It's so close to done, yet it just won't work.

ccie-go.com

I was building a Kubernetes cluster at home by following this site. (If it didn't exist, I would never have tried Kubernetes. Thank you very much! m(__)m) But at the final bootcamp step, the pod never reaches Running for some reason.

[root@master ~]# kubectl get pod -o wide
NAME                                   READY   STATUS              RESTARTS   AGE     IP       NODE    NOMINATED NODE
kubernetes-bootcamp-598f57b95c-5mg5g   0/1     ContainerCreating   0          7d21h   <none>   node2   <none>
[root@master ~]#
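Looking back, when a pod is stuck in ContainerCreating the quickest pointer is usually its Events, so a check like this (a sketch using the pod name from the output above; output omitted) would probably have named the network plugin right away:

kubectl describe pod kubernetes-bootcamp-598f57b95c-5mg5g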

From this I managed to track it down at least this far: flannel is misbehaving on the node where the bootcamp pod is scheduled.

[root@master ~]# kubectl get pod -n kube-system
NAME                             READY   STATUS             RESTARTS   AGE
coredns-576cbf47c7-59nqg         1/1     Running            0          48m
coredns-576cbf47c7-95hv2         1/1     Running            0          48m
etcd-master                      1/1     Running            0          41m
kube-apiserver-master            1/1     Running            0          41m
kube-controller-manager-master   1/1     Running            0          41m
kube-flannel-ds-gx9jc            1/1     Running            0          39m
kube-flannel-ds-hzvs2            1/1     Running            0          39m
kube-flannel-ds-k9gnc            0/1     CrashLoopBackOff   7          40m
kube-flannel-ds-l9wbd            1/1     Running            0          41m
kube-proxy-4csbj                 1/1     Running            0          40m
kube-proxy-gfr8w                 1/1     Running            0          48m
kube-proxy-lzqxw                 1/1     Running            0          39m
kube-proxy-nsfvf                 1/1     Running            0          39m
kube-scheduler-master            1/1     Running            0          41m
[root@master ~]#

So something called CrashLoopBackOff is happening. I noted the name of the failing flannel pod (kube-flannel-ds-k9gnc).
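The crashing pod's own logs should also say why it keeps restarting. A sketch using the pod name above: --previous shows the output of the last crashed container, and the container name kube-flannel is assumed from the stock flannel manifest:

kubectl logs -n kube-system kube-flannel-ds-k9gnc -c kube-flannel --previous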

I checked which node it is tied to with kubectl describe nodes (excerpt from the kubectl describe nodes output below).

Addresses:
  InternalIP:  192.168.52.82
  Hostname:    node2
Capacity:
 cpu:                1
 ephemeral-storage:  8178Mi
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             1958400Ki
 pods:               110
Allocatable:
 cpu:                1
 ephemeral-storage:  7717729063
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             1856000Ki
 pods:               110
System Info:
 Machine ID:                 1a22a7e8e4d64bc2b854d23256618274
 System UUID:                564D51B1-33E9-8F98-0C5A-E7288C59C38C
 Boot ID:                    bed998a1-0378-46c8-b068-e5562b499bb0
 Kernel Version:             3.10.0-957.el7.x86_64
 OS Image:                   CentOS Linux 7 (Core)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://17.6.0
 Kubelet Version:            v1.12.3
 Kube-Proxy Version:         v1.12.3
PodCIDR:                     10.244.1.0/24
Non-terminated Pods:         (2 in total)
  Namespace                  Name                     CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----                     ------------  ----------  ---------------  -------------
  kube-system                kube-flannel-ds-k9gnc    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-proxy-4csbj         0 (0%)        0 (0%)      0 (0%)           0 (0%)
Allocated resources:
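As an aside, the NODE column of kubectl get pod -o wide maps each pod to its node directly, which is a quicker cross-check than reading through kubectl describe nodes (a sketch; output omitted):

kubectl get pod -n kube-system -o wide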

Looking at the error for node2 in the same kubectl describe nodes output didn't make things any clearer to me at the time.

Name:               node2
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=node2
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 03 Apr 2019 23:20:01 -0400
Taints:             node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  OutOfDisk        False   Wed, 03 Apr 2019 23:32:04 -0400   Wed, 03 Apr 2019 23:20:01 -0400   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure   False   Wed, 03 Apr 2019 23:32:04 -0400   Wed, 03 Apr 2019 23:20:01 -0400   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 03 Apr 2019 23:32:04 -0400   Wed, 03 Apr 2019 23:20:01 -0400   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Wed, 03 Apr 2019 23:32:04 -0400   Wed, 03 Apr 2019 23:20:01 -0400   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Wed, 03 Apr 2019 23:32:04 -0400   Wed, 03 Apr 2019 23:20:01 -0400   KubeletNotReady              runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
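That last condition turns out to be the real clue: kubelet reports the CNI config as uninitialized. flannel's install-cni container is what drops a config file into the directory kubelet reads, so as long as flannel keeps crashing, that directory stays empty. A quick check on node2 (a sketch; /etc/cni/net.d is the kubelet default path):

ls /etc/cni/net.d/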

I tried kubeadm reset on node2 and kubeadm init on the master, but nothing cleared it up. From what searches turned up I could tell something around flannel was off, but not what.

Then something clicked.

The command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=ens160" ] line in kube-flannel.yml looked suspicious: flanneld is pinned to an interface named ens160, and if that interface doesn't exist on a node, flanneld exits right away, which is exactly what produces a CrashLoopBackOff.

And sure enough, on node2 alone the NIC interface had become ens192.

  • node2
[root@node2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:59:c3:8c brd ff:ff:ff:ff:ff:ff
    inet 192.168.52.82/24 brd 192.168.52.255 scope global noprefixroute ens192
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe59:c38c/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
  • node1
[root@node1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:88:68:cd brd ff:ff:ff:ff:ff:ff
    inet 192.168.52.81/24 brd 192.168.52.255 scope global noprefixroute ens160
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe88:68cd/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

It seems the NIC name changed because I had rebooted node2: in the chart below, 192.168.52.82 hit 49% CPU, I thought the high load might be the cause, the 49% spooked me, and I restarted the VM. f:id:TKCman:20190405172851p:plain

I used nmtui to rename the NIC from ens192 back to the original ens160. f:id:TKCman:20190405163339p:plain

But the NIC came back up with yet another name and IP, so I gave up on that route. Whatever I tried, I couldn't get it back to ens160, so I rolled node2 back to a VM snapshot taken before the k8s packages were installed with yum.

Not sure whether it would work, but I tried command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=ens160", "--iface=ens192" ] so that both interface names are listed.
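If I read the flannel documentation right, --iface can be specified multiple times and flanneld tries each in order, using the first interface that actually exists on the host, so one manifest should now cover both NIC layouts. The rough sequence after editing the manifest (a sketch; <master-ip>, <token> and <hash> are placeholders taken from the kubeadm init output):

kubectl apply -f kube-flannel.yml     # on the master, after kubeadm init
kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>     # on each node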

[root@master ~]#
[root@master ~]# kubectl get pod -n kube-system
NAME                             READY   STATUS    RESTARTS   AGE
coredns-576cbf47c7-hq4k9         1/1     Running   0          4m38s
coredns-576cbf47c7-xh8h8         1/1     Running   0          4m38s
etcd-master                      1/1     Running   0          3m41s
kube-apiserver-master            1/1     Running   0          3m51s
kube-controller-manager-master   1/1     Running   0          3m37s
kube-flannel-ds-9cdlv            1/1     Running   0          58s
kube-proxy-29b6k                 1/1     Running   0          4m38s
kube-scheduler-master            1/1     Running   0          3m51s
[root@master ~]#
[root@master ~]# kubectl get pod -n kube-system
NAME                             READY   STATUS    RESTARTS   AGE
coredns-576cbf47c7-hq4k9         1/1     Running   0          5m41s
coredns-576cbf47c7-xh8h8         1/1     Running   0          5m41s
etcd-master                      1/1     Running   0          4m44s
kube-apiserver-master            1/1     Running   0          4m54s
kube-controller-manager-master   1/1     Running   0          4m40s
kube-flannel-ds-9cdlv            1/1     Running   0          2m1s
kube-flannel-ds-jbnrs            1/1     Running   0          8s
kube-proxy-29b6k                 1/1     Running   0          5m41s
kube-proxy-vdnmv                 1/1     Running   0          8s
kube-scheduler-master            1/1     Running   0          4m54s
[root@master ~]#
[root@master ~]# kubectl get pod -n kube-system
NAME                             READY   STATUS              RESTARTS   AGE
coredns-576cbf47c7-hq4k9         1/1     Running             0          6m28s
coredns-576cbf47c7-xh8h8         1/1     Running             0          6m28s
etcd-master                      1/1     Running             0          5m31s
kube-apiserver-master            1/1     Running             0          5m41s
kube-controller-manager-master   1/1     Running             0          5m27s
kube-flannel-ds-8vt88            0/1     Init:0/1            0          8s
kube-flannel-ds-9cdlv            1/1     Running             0          2m48s
kube-flannel-ds-jbnrs            1/1     Running             0          55s
kube-flannel-ds-kdkww            1/1     Running             0          19s
kube-proxy-29b6k                 1/1     Running             0          6m28s
kube-proxy-4m7ks                 0/1     ContainerCreating   0          8s
kube-proxy-ktklf                 1/1     Running             0          19s
kube-proxy-vdnmv                 1/1     Running             0          55s
kube-scheduler-master            1/1     Running             0          5m41s
[root@master ~]#
[root@master ~]# kubectl get pod -n kube-system
NAME                             READY   STATUS     RESTARTS   AGE
coredns-576cbf47c7-hq4k9         1/1     Running    0          6m35s
coredns-576cbf47c7-xh8h8         1/1     Running    0          6m35s
etcd-master                      1/1     Running    0          5m38s
kube-apiserver-master            1/1     Running    0          5m48s
kube-controller-manager-master   1/1     Running    0          5m34s
kube-flannel-ds-8vt88            0/1     Init:0/1   0          15s
kube-flannel-ds-9cdlv            1/1     Running    0          2m55s
kube-flannel-ds-jbnrs            1/1     Running    0          62s
kube-flannel-ds-kdkww            1/1     Running    0          26s
kube-proxy-29b6k                 1/1     Running    0          6m35s
kube-proxy-4m7ks                 1/1     Running    0          15s
kube-proxy-ktklf                 1/1     Running    0          26s
kube-proxy-vdnmv                 1/1     Running    0          62s
kube-scheduler-master            1/1     Running    0          5m48s
[root@master ~]#
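As a final sanity check, flanneld logs which interface it bound to at startup, so picking node2's flannel pod out of kubectl get pod -o wide and grepping its log should confirm it chose ens192 (a sketch; <node2-flannel-pod> is a placeholder, and the grep pattern assumes flanneld's usual "Using interface" startup line):

kubectl get pod -n kube-system -o wide
kubectl logs -n kube-system <node2-flannel-pod> -c kube-flannel | grep -i "using interface"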

After kubeadm join, everything came up cleanly!! (the flannel entries in the output above, shown in red in the original post) To be continued.