Linux - Containers
This forum is for the discussion of all topics relating to Linux containers. Docker, LXC, LXD, runC, containerd, CoreOS, Kubernetes, Mesos, rkt, and all other Linux container platforms are welcome.
It's basically a collection of Terraform scripts that automatically deploys all the necessary VMs and then deploys Kubernetes via Kubespray.
To help with troubleshooting, I modified the script so that it only deploys the VMs (NOT Kubernetes).
I now run Kubespray manually to deploy the Kubernetes cluster.
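For reference, the manual run looks roughly like this (a sketch; the inventory path is illustrative and depends on how your inventory was generated):
Code:
# Run the Kubespray playbook from the Ansible control host against the
# Terraform-provisioned VMs (inventory path is an example; adjust as needed).
ansible-playbook -i inventory/mycluster/hosts.ini --become --become-user=root cluster.yml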
Unfortunately, it keeps getting stuck during the initialization of the master nodes.
Code:
TASK [kubernetes/master : kubeadm | Initialize first master] ***************************************************************************************************
Wednesday 17 July 2019 13:14:15 +0600 (0:00:00.414) 0:25:23.108 ********
FAILED - RETRYING: kubeadm | Initialize first master (3 retries left).
FAILED - RETRYING: kubeadm | Initialize first master (2 retries left).
FAILED - RETRYING: kubeadm | Initialize first master (1 retries left).
fatal: [k8s-kubespray-master-0]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["timeout", "-k", "600s", "600s", "/usr/local/bin/kubeadm", "init", "--config=/etc/kubernetes/kubeadm-config.yaml", "--ignore-preflight-errors=all", "--experimental-upload-certs", "--skip-phases=addon/coredns"], "delta": "0:05:09.774195", "end": "2019-07-17 13:35:22.429247", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2019-07-17 13:30:12.655052", "stderr": "\t[WARNING Port-6443]: Port 6443 is in use\n\t[WARNING Port-10251]: Port 10251 is in use\n\t[WARNING Port-10252]: Port 10252 is in use\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists\n\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/\n\t[WARNING Port-10250]: Port 10250 is in use\nerror execution phase wait-control-plane: couldn't initialize a Kubernetes cluster", "stderr_lines": ["\t[WARNING Port-6443]: Port 6443 is in use", "\t[WARNING Port-10251]: Port 10251 is in use", "\t[WARNING Port-10252]: Port 10252 is in use", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists", "\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". 
Please follow the guide at https://kubernetes.io/docs/setup/cri/", "\t[WARNING Port-10250]: Port 10250 is in use", "error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster"], "stdout": "[init] Using Kubernetes version: v1.14.3\n[preflight] Running pre-flight checks\n[preflight] Pulling images required for setting up a Kubernetes cluster\n[preflight] This might take a minute or two, depending on the speed of your internet connection\n[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'\n[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"\n[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"\n[kubelet-start] Activating the kubelet service\n[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"\n[certs] Using existing ca certificate authority\n[certs] Using existing apiserver certificate and key on disk\n[certs] Using existing apiserver-kubelet-client certificate and key on disk\n[certs] Using existing front-proxy-ca certificate authority\n[certs] Using existing front-proxy-client certificate and key on disk\n[certs] External etcd mode: Skipping etcd/ca certificate authority generation\n[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation\n[certs] External etcd mode: Skipping etcd/peer certificate authority generation\n[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation\n[certs] External etcd mode: Skipping etcd/server certificate authority generation\n[certs] Using the existing \"sa\" key\n[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/kubelet.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/controller-manager.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"\n[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"\n[control-plane] Creating static Pod manifest for \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"\n[control-plane] Creating static Pod manifest for \"kube-controller-manager\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"\n[control-plane] Creating static Pod manifest for \"kube-scheduler\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"\n[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory \"/etc/kubernetes/manifests\". 
This can take up to 5m0s\n[kubelet-check] Initial timeout of 40s passed.\n\nUnfortunately, an error has occurred:\n\ttimed out waiting for the condition\n\nThis error is likely caused by:\n\t- The kubelet is not running\n\t- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)\n\nIf you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:\n\t- 'systemctl status kubelet'\n\t- 'journalctl -xeu kubelet'\n\nAdditionally, a control plane component may have crashed or exited when started by the container runtime.\nTo troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.\nHere is one example how you may list all Kubernetes containers running in docker:\n\t- 'docker ps -a | grep kube | grep -v pause'\n\tOnce you have found the failing container, you can inspect its logs with:\n\t- 'docker logs CONTAINERID'", "stdout_lines": ["[init] Using Kubernetes version: v1.14.3", "[preflight] Running pre-flight checks", "[preflight] Pulling images required for setting up a Kubernetes cluster", "[preflight] This might take a minute or two, depending on the speed of your internet connection", "[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'", "[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"", "[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"", "[kubelet-start] Activating the kubelet service", "[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"", "[certs] Using existing ca certificate authority", "[certs] Using existing apiserver certificate and key on disk", "[certs] Using existing apiserver-kubelet-client certificate and key on disk", "[certs] Using existing front-proxy-ca certificate authority", "[certs] Using existing front-proxy-client certificate and key on disk", "[certs] External etcd mode: Skipping etcd/ca certificate authority generation", "[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation", "[certs] External etcd mode: Skipping etcd/peer certificate authority generation", "[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation", "[certs] External etcd mode: Skipping etcd/server certificate authority generation", "[certs] Using the existing \"sa\" key", "[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/kubelet.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/controller-manager.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"", "[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"", "[control-plane] Creating static Pod manifest for \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"", "[control-plane] Creating static Pod manifest for \"kube-controller-manager\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[controlplane] Adding 
extra host path mount \"cloud-config\" to \"kube-controller-manager\"", "[control-plane] Creating static Pod manifest for \"kube-scheduler\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"", "[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory \"/etc/kubernetes/manifests\". This can take up to 5m0s", "[kubelet-check] Initial timeout of 40s passed.", "", "Unfortunately, an error has occurred:", "\ttimed out waiting for the condition", "", "This error is likely caused by:", "\t- The kubelet is not running", "\t- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)", "", "If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:", "\t- 'systemctl status kubelet'", "\t- 'journalctl -xeu kubelet'", "", "Additionally, a control plane component may have crashed or exited when started by the container runtime.", "To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.", "Here is one example how you may list all Kubernetes containers running in docker:", "\t- 'docker ps -a | grep kube | grep -v pause'", "\tOnce you have found the failing container, you can inspect its logs with:", "\t- 'docker logs CONTAINERID'"]}
NO MORE HOSTS LEFT *********************************************************************************************************************************************
to retry, use: --limit @/home/akij.net/ashfaqur.corp/terraform_custom/terraform-vsphere-kubespray-master/ansible/kubespray/cluster.retry
PLAY RECAP *****************************************************************************************************************************************************
k8s-kubespray-master-0 : ok=300 changed=85 unreachable=0 failed=1
k8s-kubespray-master-1 : ok=291 changed=82 unreachable=0 failed=0
k8s-kubespray-master-2 : ok=291 changed=82 unreachable=0 failed=0
k8s-kubespray-worker-0 : ok=235 changed=59 unreachable=0 failed=0
k8s-kubespray-worker-1 : ok=221 changed=59 unreachable=0 failed=0
k8s-kubespray-worker-2 : ok=221 changed=59 unreachable=0 failed=0
localhost : ok=1 changed=0 unreachable=0 failed=0
Wednesday 17 July 2019 13:35:22 +0600 (0:21:06.690) 0:46:29.799 ********
===============================================================================
kubernetes/master : kubeadm | Initialize first master ------------------------------------------------------------------------------------------------ 1266.69s
download : file_download | Download item -------------------------------------------------------------------------------------------------------------- 213.79s
container-engine/docker : ensure docker packages are installed ----------------------------------------------------------------------------------------- 92.24s
download : container_download | download images for kubeadm config images ------------------------------------------------------------------------------ 87.91s
bootstrap-os : Install python -------------------------------------------------------------------------------------------------------------------------- 78.73s
download : file_download | Download item --------------------------------------------------------------------------------------------------------------- 69.61s
download : file_download | Download item --------------------------------------------------------------------------------------------------------------- 56.68s
download : container_download | Download containers if pull is required or told to always pull (all nodes) --------------------------------------------- 50.98s
download : container_download | Download containers if pull is required or told to always pull (all nodes) --------------------------------------------- 34.88s
download : container_download | Download containers if pull is required or told to always pull (all nodes) --------------------------------------------- 32.75s
kubernetes/preinstall : Install packages requirements -------------------------------------------------------------------------------------------------- 27.50s
bootstrap-os : Install dbus for the hostname module ---------------------------------------------------------------------------------------------------- 27.49s
download : container_download | Download containers if pull is required or told to always pull (all nodes) --------------------------------------------- 26.50s
container-engine/docker : ensure docker-ce repository is enabled --------------------------------------------------------------------------------------- 25.22s
download : container_download | Download containers if pull is required or told to always pull (all nodes) --------------------------------------------- 23.29s
etcd : Gen_certs | Write etcd master certs ------------------------------------------------------------------------------------------------------------- 19.63s
download : container_download | Download containers if pull is required or told to always pull (all nodes) --------------------------------------------- 17.27s
download : container_download | Download containers if pull is required or told to always pull (all nodes) --------------------------------------------- 15.06s
download : container_download | Download containers if pull is required or told to always pull (all nodes) --------------------------------------------- 12.46s
etcd : reload etcd ------------------------------------------------------------------------------------------------------------------------------------- 11.78s
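Note the preflight warnings above: ports 6443/10250-10252 already in use and manifests already present under /etc/kubernetes/manifests indicate leftover state from an earlier attempt. One common way to get a clean slate on the failed master before re-running the playbook (a sketch; kubeadm reset is destructive and its exact behavior varies by version):
Code:
# Remove the partial control-plane state (static pod manifests, certs,
# kubelet config) left behind by the earlier kubeadm init attempt.
sudo kubeadm reset -f

# Optionally confirm nothing is still listening on the API server port.
sudo ss -tlnp | grep 6443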
Some output of the command 'journalctl -xeu kubelet' from inside the failed master node:
Code:
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:55.536060 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:55.593387 12293 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://172.17.17.143:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dk8s-kubespray-master-0&limit=500&resourceVersion=0: dial tcp 172.17.17.143:6443: connect: no route to host
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:55.593734 12293 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:442: Failed to list *v1.Service: Get https://172.17.17.143:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 172.17.17.143:6443: connect: no route to host
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:55.594022 12293 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSIDriver: Get https://172.17.17.143:6443/apis/storage.k8s.io/v1beta1/csidrivers?limit=500&resourceVersion=0: dial tcp 172.17.17.143:6443: connect: no route to host
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:55.594357 12293 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.RuntimeClass: Get https://172.17.17.143:6443/apis/node.k8s.io/v1beta1/runtimeclasses?limit=500&resourceVersion=0: dial tcp 172.17.17.143:6443: connect: no route to host
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:55.595058 12293 event.go:200] Unable to write event: 'Patch https://172.17.17.143:6443/api/v1/namespaces/default/events/k8s-kubespray-master-0.15b221487bd06f61: dial tcp 172.17.17.143:6443: connect: no route to host' (may retry after sleeping)
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:55.595654 12293 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Node: Get https://172.17.17.143:6443/api/v1/nodes?fieldSelector=metadata.name%3Dk8s-kubespray-master-0&limit=500&resourceVersion=0: dial tcp 172.17.17.143:6443: connect: no route to host
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:55.637619 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:55.738986 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: W0717 13:51:55.801163 12293 cni.go:213] Unable to update cni config: No networks found in /etc/cni/net.d
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:55.839245 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:55 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:55.939509 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:56.039774 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:56.140942 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:56.242142 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:56.342497 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:56.442820 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:56.468218 12293 eviction_manager.go:247] eviction manager: failed to get summary stats: failed to get node info: node "k8s-kubespray-master-0" not found
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:56.543874 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:56.644389 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: I0717 13:51:56.702088 12293 kubelet_node_status.go:283] Setting node annotation to enable volume controller attach/detach
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: I0717 13:51:56.702990 12293 vsphere.go:857] The vSphere cloud provider does not support zones
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: I0717 13:51:56.703936 12293 setters.go:73] Using node IP: "172.17.17.133"
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: I0717 13:51:56.707944 12293 kubelet_node_status.go:468] Recording NodeHasSufficientMemory event message for node k8s-kubespray-master-0
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: I0717 13:51:56.708010 12293 kubelet_node_status.go:468] Recording NodeHasNoDiskPressure event message for node k8s-kubespray-master-0
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: I0717 13:51:56.708043 12293 kubelet_node_status.go:468] Recording NodeHasSufficientPID event message for node k8s-kubespray-master-0
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:56.746177 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:56.849506 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:56 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:56.950907 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:57 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:57.051278 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
Jul 17 13:51:57 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:57.132686 12293 kubelet.go:2170] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jul 17 13:51:57 k8s-kubespray-master-0 kubelet[12293]: E0717 13:51:57.151625 12293 kubelet.go:2244] node "k8s-kubespray-master-0" not found
From the log you posted, the first thing that jumps out is a network issue: the kubelet cannot reach the API server at 172.17.17.143:6443 ("dial tcp 172.17.17.143:6443: connect: no route to host").
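A quick way to confirm this from the failed master is to probe the API server endpoint directly (the address is taken from the kubelet errors above):
Code:
# 172.17.17.143:6443 is the API server endpoint the kubelet cannot reach.
ip route get 172.17.17.143                    # is there a route to the VIP?
ping -c 3 172.17.17.143                       # basic reachability test
curl -k https://172.17.17.143:6443/healthz    # API server health check, if reachable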
Yep, you are spot on.
I finally figured out that it was a Terraform (configuration) issue, NOT a Kubespray/Kubernetes issue.
The scripts had Terraform configure a NIC named ens192, but my template VM's NIC was ens160.
This caused the 'keepalived' service to fail on the HAProxy servers, making the VIP inaccessible.
Once I changed the NIC name in the (Terraform) configuration files, things worked out fine.
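For anyone hitting the same mismatch, a rough sketch of how to track it down (paths and interface names are illustrative):
Code:
# On the template VM: list the actual interface names (here it was ens160).
ip -br link show

# In the Terraform configs: find where the expected NIC name is hard-coded.
grep -rn "ens192" terraform-vsphere-kubespray-master/

# On the HAProxy nodes: check keepalived; once it is healthy, the VIP
# should appear on the interface.
systemctl status keepalived
ip addr show ens160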