The Kubernetes Cluster package by CirrusGrid is a complex product whose setup involves multiple steps. Each step can fail for various reasons, and these should be analyzed to keep the problem from recurring. Below is an overview of the main troubleshooting steps for the different stages and log files.
Installation of a Kubernetes cluster is a complex but fully automated process with a built-in error handling mechanism. The platform automatically handles the most common issues and shows their root cause directly in the dashboard. For more complex issues, you can Send Report to the support team via the appropriate widget.
Such a report includes installation logs, error messages, and all other debug information required.
The package also verifies all cluster components automatically after installation. The details are available in the /var/log/k8s-health-check.log file on the master node. A dedicated utility script checks the health of the following components: Weave CNI Plugin, Ingress Controller, Metrics Server, Kubernetes Dashboard, Node Problem Detector, Monitoring Tools, Remote API, NFS Storage, Sample App.
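To inspect the health-check results on the master node, a small helper such as the one below may be convenient. It is only a sketch: the function name show_health_log and the default of 50 lines are assumptions made for illustration, and the log's internal format is not described here, so the helper simply prints the end of the file.

```shell
# Print the last lines of the health-check log.
# Usage: show_health_log [log_file] [line_count]
# The default path comes from the text above; the function name and the
# default of 50 lines are illustrative assumptions.
show_health_log() {
  log_file="${1:-/var/log/k8s-health-check.log}"
  line_count="${2:-50}"
  if [ ! -r "$log_file" ]; then
    echo "Cannot read $log_file" >&2
    return 1
  fi
  tail -n "$line_count" "$log_file"
}
```

Calling show_health_log without arguments on the master node prints the most recent entries of the default log file.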
If the health checker cannot verify the Running status of a component, a notification is displayed in the installation success frame. Such a warning does not always mean the cluster is malfunctioning (for example, deployments may still be in progress). You can run the kubectl get pods --all-namespaces command to check the state of the pods. If all of them are Running, your cluster is fine. Otherwise, contact platform support and attach the K8s-related logs from the /var/log directory.
You can use kubectl or Kubernetes Dashboard to track and analyze events for a particular namespace or for all namespaces at once (sufficient permissions are required):
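The pod check described above can be scripted. The following is a minimal sketch, not part of the package: the function names filter_unhealthy_pods and check_pods are hypothetical, and treating the Completed status (finished jobs) as healthy alongside Running is an assumption added here.

```shell
# Read `kubectl get pods --all-namespaces --no-headers` output on stdin.
# Columns are: NAMESPACE NAME READY STATUS RESTARTS AGE.
# Print "namespace/pod status" for every pod that is neither Running
# nor Completed (counting Completed as healthy is an assumption).
filter_unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}

# Hypothetical wrapper: list problem pods and return non-zero if any exist.
check_pods() {
  unhealthy="$(kubectl get pods --all-namespaces --no-headers | filter_unhealthy_pods)"
  if [ -n "$unhealthy" ]; then
    printf '%s\n' "$unhealthy"
    return 1
  fi
  echo "All pods are Running or Completed"
}
```

Running check_pods shortly after installation may still list pods in states such as ContainerCreating; repeating the check after a few minutes helps tell an in-progress deployment from a real failure.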
Figure: Events in Kubernetes Dashboard
Figure: Example output of the kubectl get events -n $namespace command
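On the command line, event queries along these lines may be useful. This is a sketch: the wrapper function names are hypothetical, while the kubectl flags used (-n, --all-namespaces, --sort-by, --field-selector) are standard options of kubectl get.

```shell
# Events for a single namespace, sorted by creation time (oldest first).
# Usage: events_in_namespace <namespace>
events_in_namespace() {
  kubectl get events -n "$1" --sort-by=.metadata.creationTimestamp
}

# Only Warning-type events across all namespaces.
warning_events() {
  kubectl get events --all-namespaces --field-selector type=Warning
}
```

Filtering on type=Warning cuts out routine events such as image pulls and successful scheduling, which tends to make failures easier to spot.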
After scheduling pod(s) to run on a free node, you can follow the appropriate logs via:
For example, these logs can help you find the root cause of the “Back-off restarting failed container” event for your pods.
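One way to reach pod logs is kubectl logs, sketched below under the assumption that the pod name and namespace are already known (for instance, from the events list). The wrapper function names are hypothetical; the kubectl subcommands and flags are standard.

```shell
# Stream the logs of a running pod.
# Usage: follow_pod_logs <namespace> <pod>
follow_pod_logs() {
  kubectl logs -f "$2" -n "$1"
}

# Show the logs of the previous container instance, which is usually
# the relevant one when a container keeps crashing and restarting.
previous_pod_logs() {
  kubectl logs --previous "$2" -n "$1"
}

# Show pod details, including the last container state and recent events.
describe_pod() {
  kubectl describe pod "$2" -n "$1"
}
```

For a pod in a restart loop, previous_pod_logs is typically more informative than follow_pod_logs, because the current container instance may not have produced any output yet.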