Kubernetes pod won't start: CrashLoopBackOff
So close to done, and yet it wouldn't work.
I was building a Kubernetes cluster at home by following this URL. (If that site hadn't existed, I would never have thought to try Kubernetes. Thank you very much! m(__)m) But at the final bootcamp step, the pod never reached Running for some reason.
[root@master ~]# kubectl get pod -o wide
NAME                                   READY   STATUS              RESTARTS   AGE     IP   NODE    NOMINATED NODE
kubernetes-bootcamp-598f57b95c-5mg5g   0/1     ContainerCreating   0          7d21h        node2
[root@master ~]#
From there I managed to track it down this far: flannel on the node running the bootcamp pod was misbehaving.
[root@master ~]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-576cbf47c7-59nqg 1/1 Running 0 48m
coredns-576cbf47c7-95hv2 1/1 Running 0 48m
etcd-master 1/1 Running 0 41m
kube-apiserver-master 1/1 Running 0 41m
kube-controller-manager-master 1/1 Running 0 41m
kube-flannel-ds-gx9jc 1/1 Running 0 39m
kube-flannel-ds-hzvs2 1/1 Running 0 39m
kube-flannel-ds-k9gnc 0/1 CrashLoopBackOff 7 40m
kube-flannel-ds-l9wbd 1/1 Running 0 41m
kube-proxy-4csbj 1/1 Running 0 40m
kube-proxy-gfr8w 1/1 Running 0 48m
kube-proxy-lzqxw 1/1 Running 0 39m
kube-proxy-nsfvf 1/1 Running 0 39m
kube-scheduler-master 1/1 Running 0 41m
[root@master ~]#
[root@master ~]#
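As an aside, the unhealthy pods can be picked out of a listing like the one above mechanically. The filter below is a sketch: the sample output is inlined (condensed from the listing above) so the pipeline runs standalone; on a live cluster you would pipe the real `kubectl get pod -n kube-system` into the same awk.

```shell
# Print pods whose READY column is not 1/1, from 'kubectl get pod -n kube-system' output.
# Sample output inlined so this runs standalone; on a cluster, pipe the real command in.
sample='NAME                    READY   STATUS             RESTARTS   AGE
kube-flannel-ds-gx9jc   1/1     Running            0          39m
kube-flannel-ds-k9gnc   0/1     CrashLoopBackOff   7          40m
kube-proxy-4csbj        1/1     Running            0          40m'

# Skip the header row, then keep rows where READY != 1/1.
printf '%s\n' "$sample" | awk 'NR > 1 && $2 != "1/1" {print $1, $3}'
# → kube-flannel-ds-k9gnc CrashLoopBackOff
```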
So something called CrashLoopBackOff is happening. I noted the name of the affected flannel pod (kube-flannel-ds-k9gnc).
Then I ran kubectl describe nodes to check which node it is tied to (excerpt from kubectl describe nodes below).
Addresses:
  InternalIP:  192.168.52.82
  Hostname:    node2
Capacity:
 cpu:                1
 ephemeral-storage:  8178Mi
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             1958400Ki
 pods:               110
Allocatable:
 cpu:                1
 ephemeral-storage:  7717729063
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             1856000Ki
 pods:               110
System Info:
 Machine ID:                 1a22a7e8e4d64bc2b854d23256618274
 System UUID:                564D51B1-33E9-8F98-0C5A-E7288C59C38C
 Boot ID:                    bed998a1-0378-46c8-b068-e5562b499bb0
 Kernel Version:             3.10.0-957.el7.x86_64
 OS Image:                   CentOS Linux 7 (Core)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://17.6.0
 Kubelet Version:            v1.12.3
 Kube-Proxy Version:         v1.12.3
PodCIDR:                     10.244.1.0/24
Non-terminated Pods:         (2 in total)
  Namespace    Name                   CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------    ----                   ------------  ----------  ---------------  -------------
  kube-system  kube-flannel-ds-k9gnc  0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system  kube-proxy-4csbj       0 (0%)        0 (0%)      0 (0%)           0 (0%)
Allocated resources:
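(Incidentally, the pod-to-node mapping can also be read straight off `kubectl get pod -n kube-system -o wide`, which prints a NODE column, without digging through kubectl describe nodes. A sketch, with the relevant row inlined as sample data; flannel pods run with hostNetwork, so the pod IP equals the node IP.)

```shell
# Read the NODE column for a specific pod from 'kubectl get pod -n kube-system -o wide' output.
# Sample row inlined (columns: NAME READY STATUS RESTARTS AGE IP NODE);
# on a cluster, pipe the real command in instead.
wide='NAME                    READY   STATUS             RESTARTS   AGE   IP              NODE
kube-flannel-ds-k9gnc   0/1     CrashLoopBackOff   7          40m   192.168.52.82   node2'

printf '%s\n' "$wide" | awk '$1 == "kube-flannel-ds-k9gnc" {print $7}'
# → node2
```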
Even looking at the error content for node2 in the same kubectl describe nodes output, I couldn't make sense of it at first.
Name:               node2
Roles:
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=node2
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 03 Apr 2019 23:20:01 -0400
Taints:             node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Conditions:
  Type            Status  LastHeartbeatTime                 LastTransitionTime                Reason                      Message
  ----            ------  -----------------                 ------------------                ------                      -------
  OutOfDisk       False   Wed, 03 Apr 2019 23:32:04 -0400   Wed, 03 Apr 2019 23:20:01 -0400   KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure  False   Wed, 03 Apr 2019 23:32:04 -0400   Wed, 03 Apr 2019 23:20:01 -0400   KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure    False   Wed, 03 Apr 2019 23:32:04 -0400   Wed, 03 Apr 2019 23:20:01 -0400   KubeletHasNoDiskPressure    kubelet has no disk pressure
  PIDPressure     False   Wed, 03 Apr 2019 23:32:04 -0400   Wed, 03 Apr 2019 23:20:01 -0400   KubeletHasSufficientPID     kubelet has sufficient PID available
  Ready           False   Wed, 03 Apr 2019 23:32:04 -0400   Wed, 03 Apr 2019 23:20:01 -0400   KubeletNotReady             runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
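In hindsight, the decisive line in that wall of output is the Ready condition: "cni config uninitialized" means the kubelet itself is fine but no CNI plugin has set up pod networking on the node, which is exactly what a crashed flannel DaemonSet pod leaves behind. A quicker way to spot such nodes is plain `kubectl get nodes`; the filter below is a sketch over inlined sample output (node names from this post, AGE values illustrative).

```shell
# List nodes whose STATUS is not Ready, from 'kubectl get nodes' output.
# Sample output inlined (node names from the post; AGE values are illustrative assumptions).
nodes='NAME     STATUS     ROLES    AGE   VERSION
master   Ready      master   48m   v1.12.3
node1    Ready      <none>   41m   v1.12.3
node2    NotReady   <none>   40m   v1.12.3'

printf '%s\n' "$nodes" | awk 'NR > 1 && $2 != "Ready" {print $1, $2}'
# → node2 NotReady
```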
I tried kubeadm reset on node2 and kubeadm init on the master, but that didn't fix it. From what came up in searches, I could tell something around flannel was wrong, but not what.
Then something clicked.
This line in kube-flannel.yml:
command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=ens160" ]
looked suspicious.
On node2, and only node2, the NIC interface name had become ens192.
- node2
[root@node2 ~]# ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens192: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:59:c3:8c brd ff:ff:ff:ff:ff:ff
    inet 192.168.52.82/24 brd 192.168.52.255 scope global noprefixroute ens192
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe59:c38c/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
- node1
[root@node1 ~]# ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:88:68:cd brd ff:ff:ff:ff:ff:ff
    inet 192.168.52.81/24 brd 192.168.52.255 scope global noprefixroute ens160
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe88:68cd/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
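The NIC that flannel's --iface should point at is normally the one carrying the node's default route, so it can be read off the routing table on each node. A sketch, run here against an inlined sample route table (the gateway address 192.168.52.1 is an assumed value, not from the post); on a real node you would run the live `ip route` pipeline shown in the comment.

```shell
# Print the interface that carries the default route
# ('ip route' prints: default via <gateway> dev <interface> ...; the interface is field 5).
# Live command:  ip route | awk '/^default/ {print $5; exit}'
# Sample route table for a node like node1 (gateway address is an assumed value):
routes='default via 192.168.52.1 dev ens160 proto static metric 100
192.168.52.0/24 dev ens160 proto kernel scope link src 192.168.52.81'

printf '%s\n' "$routes" | awk '/^default/ {print $5; exit}'
# → ens160
```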
It seems the NIC name changed because I had rebooted node2, thinking its high load was the cause. (In the figure below, 192.168.52.82 is at 49% CPU; I got spooked by the 49% and rebooted.)
I tried renaming the NIC from ens192 back to the original ens160 with nmtui, but the NIC came up with yet another name and IP, so I gave up on that. I tried various other things too, but couldn't get it back to ens160, so I restored node2 from a VM snapshot taken before the k8s packages were installed with yum.
I wasn't sure whether it would work, but I tried
command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=ens160", "--iface=ens192" ]
instead.
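This relies on flanneld accepting --iface more than once: per the flannel README, the option can be specified multiple times and the candidates are checked in order, so each node binds to whichever of ens160/ens192 it actually has. The edit to kube-flannel.yml can be scripted with sed; below, the substitution runs on an inlined copy of the command line so it is self-contained (on the real file it would be `sed -i ... kube-flannel.yml` followed by `kubectl apply -f kube-flannel.yml`).

```shell
# Append a second --iface candidate to the flanneld command line from kube-flannel.yml.
# Demonstrated on an inlined copy of the line; use 'sed -i' on the real file instead.
line='command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=ens160" ]'

printf '%s\n' "$line" | sed 's/"--iface=ens160"/"--iface=ens160", "--iface=ens192"/'
# → command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=ens160", "--iface=ens192" ]
```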
[root@master ~]#
[root@master ~]# kubectl get pod -n kube-system
NAME                             READY   STATUS    RESTARTS   AGE
coredns-576cbf47c7-hq4k9         1/1     Running   0          4m38s
coredns-576cbf47c7-xh8h8         1/1     Running   0          4m38s
etcd-master                      1/1     Running   0          3m41s
kube-apiserver-master            1/1     Running   0          3m51s
kube-controller-manager-master   1/1     Running   0          3m37s
kube-flannel-ds-9cdlv            1/1     Running   0          58s
kube-proxy-29b6k                 1/1     Running   0          4m38s
kube-scheduler-master            1/1     Running   0          3m51s
[root@master ~]#
[root@master ~]# kubectl get pod -n kube-system
NAME                             READY   STATUS    RESTARTS   AGE
coredns-576cbf47c7-hq4k9         1/1     Running   0          5m41s
coredns-576cbf47c7-xh8h8         1/1     Running   0          5m41s
etcd-master                      1/1     Running   0          4m44s
kube-apiserver-master            1/1     Running   0          4m54s
kube-controller-manager-master   1/1     Running   0          4m40s
kube-flannel-ds-9cdlv            1/1     Running   0          2m1s
kube-flannel-ds-jbnrs            1/1     Running   0          8s
kube-proxy-29b6k                 1/1     Running   0          5m41s
kube-proxy-vdnmv                 1/1     Running   0          8s
kube-scheduler-master            1/1     Running   0          4m54s
[root@master ~]#
[root@master ~]# kubectl get pod -n kube-system
NAME                             READY   STATUS              RESTARTS   AGE
coredns-576cbf47c7-hq4k9         1/1     Running             0          6m28s
coredns-576cbf47c7-xh8h8         1/1     Running             0          6m28s
etcd-master                      1/1     Running             0          5m31s
kube-apiserver-master            1/1     Running             0          5m41s
kube-controller-manager-master   1/1     Running             0          5m27s
kube-flannel-ds-8vt88            0/1     Init:0/1            0          8s
kube-flannel-ds-9cdlv            1/1     Running             0          2m48s
kube-flannel-ds-jbnrs            1/1     Running             0          55s
kube-flannel-ds-kdkww            1/1     Running             0          19s
kube-proxy-29b6k                 1/1     Running             0          6m28s
kube-proxy-4m7ks                 0/1     ContainerCreating   0          8s
kube-proxy-ktklf                 1/1     Running             0          19s
kube-proxy-vdnmv                 1/1     Running             0          55s
kube-scheduler-master            1/1     Running             0          5m41s
[root@master ~]#
[root@master ~]# kubectl get pod -n kube-system
NAME                             READY   STATUS     RESTARTS   AGE
coredns-576cbf47c7-hq4k9         1/1     Running    0          6m35s
coredns-576cbf47c7-xh8h8         1/1     Running    0          6m35s
etcd-master                      1/1     Running    0          5m38s
kube-apiserver-master            1/1     Running    0          5m48s
kube-controller-manager-master   1/1     Running    0          5m34s
kube-flannel-ds-8vt88            0/1     Init:0/1   0          15s
kube-flannel-ds-9cdlv            1/1     Running    0          2m55s
kube-flannel-ds-jbnrs            1/1     Running    0          62s
kube-flannel-ds-kdkww            1/1     Running    0          26s
kube-proxy-29b6k                 1/1     Running    0          6m35s
kube-proxy-4m7ks                 1/1     Running    0          15s
kube-proxy-ktklf                 1/1     Running    0          26s
kube-proxy-vdnmv                 1/1     Running    0          62s
kube-scheduler-master            1/1     Running    0          5m48s
[root@master ~]#
After kubeadm join, everything came up fine (the flannel pods)!! To be continued.