Kubernetes monitoring tools give you visibility into your containers, pods, and clusters. They help you ensure reliability, troubleshoot issues as they occur, monitor and enforce security, manage costs, fine-tune performance, and minimize chargebacks.
Monitoring is especially important in a containerized environment, because resources are ephemeral, the environment is complex, and it can be difficult to identify and troubleshoot problems.
Traditional monitoring tools are typically not effective in a containerized environment. A new generation of cloud-native monitoring tools has emerged. These tools can be deployed as part of Kubernetes clusters and gather relevant metrics from across the Kubernetes environment.
Here are several reasons your organization needs a robust monitoring strategy for Kubernetes:
The following open-source tools are at the forefront of cloud-native monitoring technology. Let’s briefly review their features and capabilities.
The Kubernetes dashboard is the official web-based user interface for Kubernetes. It provides a reliable way to visualize important information from the containers and pods running in your clusters. It is an add-on maintained by the Kubernetes project (it is not deployed by default), and it lets you view and manage many aspects of a Kubernetes cluster.
The Kubernetes dashboard provides metrics and visualizations of:
It also allows you to make changes to the Kubernetes environment:
GitHub repo: https://github.com/kubernetes/dashboard
Prometheus is a popular open-source monitoring tool for Kubernetes. Prometheus was originally a SoundCloud venture and is now a Graduated Project of the Cloud Native Computing Foundation (CNCF). The tool has evolved into a standard for monitoring Kubernetes. Prometheus collects metrics by scraping HTTP endpoints at regular intervals and stores them as time series.
Prometheus is divided into three components that perform different tasks: the Prometheus server, Alertmanager, and exporters. The Prometheus server handles service discovery, scrapes metrics from exporters, and stores them in a time series database for monitoring. Alertmanager receives alerts fired by the server when certain triggers are activated, then groups, deduplicates, and routes them as notifications. Exporters are separate processes, often run as independent containers, that expose metrics from a system in a format Prometheus can scrape.
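To make the exporter and server roles concrete, here is a minimal sketch of both sides of a scrape, using only the Python standard library. It renders a couple of samples in the Prometheus text exposition format (what an exporter serves) and parses them back (what the server does after fetching the page). The `Sample` class, the function names, and the metric values are hypothetical, and the parser is deliberately simplified: it ignores timestamps and does not handle escaped characters or commas inside label values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One data point of a time series (hypothetical helper type)."""
    name: str
    labels: tuple  # (key, value) pairs
    value: float


def render_metrics(samples):
    """Exporter side: render samples in the Prometheus text exposition format."""
    lines = []
    for s in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in s.labels)
        series = f"{s.name}{{{label_str}}}" if label_str else s.name
        lines.append(f"{series} {s.value}")
    return "\n".join(lines) + "\n"


def parse_metrics(text):
    """Server side: parse scraped text back into samples (simplified)."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        series, _, value = line.rpartition(" ")
        name, _, rest = series.partition("{")
        labels = []
        if rest:
            for pair in rest.rstrip("}").split(","):
                key, _, val = pair.partition("=")
                labels.append((key, val.strip('"')))
        samples.append(Sample(name, tuple(labels), float(value)))
    return samples


# Illustrative values only.
exported = render_metrics([
    Sample("http_requests_total", (("method", "GET"), ("code", "200")), 1027.0),
    Sample("process_open_fds", (), 12.0),
])
scraped = parse_metrics(exported)
print(exported, end="")
print(scraped)
```

In a real deployment you would normally rely on an official Prometheus client library rather than hand-rolling the format; the sketch only shows what travels over the wire between an exporter and the server.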
Other features of Prometheus include:
GitHub repo: https://github.com/prometheus/prometheus
Jaeger is an end-to-end distributed tracing solution that was open sourced by Uber Engineering, and is now a graduated CNCF project. It lets you monitor and troubleshoot transactions in complex distributed systems. In modern microservices architectures, most operational issues involve networking and observability.
When a service fails, it is hard to determine, without tracing, how requests passed from one service to another over the network while completing a single business transaction. This makes debugging very difficult. Jaeger uses tracing to support root cause analysis, performance and latency optimization, and distributed transaction monitoring. Jaeger works with Istio, a popular service mesh implementation originally developed by Google, IBM, and Lyft.
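The idea behind tracing can be sketched in a few lines: each unit of work is recorded as a span that points to its parent, so the full request path can be rebuilt after the fact. The example below is a minimal illustration under that assumption, not Jaeger's API or data model. The `Span` class, its field names, and the sample services are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One unit of work in a trace (hypothetical, simplified structure)."""
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    service: str
    operation: str
    duration_ms: float
    error: bool = False


def request_path_to_error(spans):
    """Return the service chain from the root span to the deepest failing span."""
    by_id = {s.span_id: s for s in spans}
    failing = [s for s in spans if s.error]
    if not failing:
        return []

    def depth(span):
        d = 0
        while span.parent_id is not None:
            span = by_id[span.parent_id]
            d += 1
        return d

    # Heuristic: the deepest failing span is the likely origin,
    # because errors tend to propagate upward to the callers.
    node = max(failing, key=depth)
    path = []
    while node is not None:
        path.append(f"{node.service}:{node.operation}")
        node = by_id[node.parent_id] if node.parent_id is not None else None
    return list(reversed(path))


# Illustrative trace of a single checkout request.
trace = [
    Span("a", None, "frontend", "GET /checkout", 480.0, error=True),
    Span("b", "a", "cart", "load_cart", 35.0),
    Span("c", "a", "payment", "charge", 420.0, error=True),
    Span("d", "c", "bank-gateway", "authorize", 400.0, error=True),
]
path = request_path_to_error(trace)
print(" -> ".join(path))
```

Without the parent links, the three failing services would look like three unrelated errors; with them, the failure can be followed from the user-facing request down to the call where it most likely started.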
GitHub repo: https://github.com/jaegertracing/jaeger-kubernetes
The ELK stack is a popular open-source solution for enterprise search and log management, which can handle Kubernetes logs as well. You can use it for both monitoring and log management.
ELK consists of a set of three tools (Elasticsearch, Logstash, and Kibana) and one data collection agent (Beats).
Beats include modules that support Kubernetes and Docker, with auto-discovery capabilities. These Beats help you monitor applications and system-level performance by collecting many types of logs and metrics.
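As a rough illustration of what a log collection agent does before shipping data, the sketch below turns one raw container log line into a structured document enriched with Kubernetes metadata. It assumes the line uses the CRI log layout (timestamp, stream, a `P` or `F` tag for partial or full lines, then the message). The function name, the output field names, and the metadata values are hypothetical and are not the schema Beats actually emits.

```python
import json


def parse_cri_line(line, metadata):
    """Split a CRI-format log line into a structured document.

    Assumes the line has all four fields; a line with an empty message
    would need extra handling that this sketch leaves out.
    """
    timestamp, stream, tag, message = line.rstrip("\n").split(" ", 3)
    return {
        "@timestamp": timestamp,
        "stream": stream,          # "stdout" or "stderr"
        "partial": tag == "P",     # "P" marks a partial line, "F" a full one
        "message": message,
        "kubernetes": metadata,    # metadata a collector would attach after discovery
    }


# Illustrative input only.
raw = "2024-05-01T12:00:00.123456789Z stderr F connection refused: db:5432"
doc = parse_cri_line(raw, {"namespace": "shop", "pod": "api-7d9f", "container": "api"})
print(json.dumps(doc, indent=2))
```

Structured documents like this are what make it practical to filter logs by namespace, pod, or stream once they are indexed.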
GitHub repo: https://github.com/deviantony/docker-elk
kubewatch helps you keep track of certain Kubernetes events and sends notifications about them to collaboration tools like PagerDuty and Slack. It watches for changes to the Kubernetes resources you specify, such as pods, DaemonSets, deployments, ReplicaSets, replication controllers, secrets, ConfigMaps, and services. kubewatch is easy to configure and can be deployed manually, or automatically via Helm charts.
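The watch-filter-notify pattern described here can be sketched as follows. This is not kubewatch's code or configuration. It assumes events shaped like Kubernetes watch events (a `type` of ADDED, MODIFIED, or DELETED plus the affected `object`) and builds a message body with a single `text` field, as accepted by Slack incoming webhooks. `WATCHED_KINDS`, `build_notification`, and the sample events are hypothetical, and nothing is sent over the network.

```python
import json

# Hypothetical selection of resource kinds to report on.
WATCHED_KINDS = {"Pod", "Deployment", "Service"}


def build_notification(event):
    """Return a webhook payload for a watched event, or None to ignore it."""
    obj = event["object"]
    if obj["kind"] not in WATCHED_KINDS:
        return None
    meta = obj["metadata"]
    text = f'{event["type"]}: {obj["kind"]} {meta["namespace"]}/{meta["name"]}'
    return {"text": text}


# Illustrative events only.
events = [
    {"type": "ADDED",
     "object": {"kind": "Pod", "metadata": {"namespace": "shop", "name": "api-7d9f"}}},
    {"type": "MODIFIED",
     "object": {"kind": "Secret", "metadata": {"namespace": "shop", "name": "db-creds"}}},
    {"type": "DELETED",
     "object": {"kind": "Service", "metadata": {"namespace": "shop", "name": "api"}}},
]

payloads = [p for p in map(build_notification, events) if p is not None]
for payload in payloads:
    print(json.dumps(payload))
```

The Secret event is dropped because its kind is not in the watched set; restricting the set is what keeps a notification channel from being flooded.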
GitHub repo: https://github.com/bitnami-labs/kubewatch
The cAdvisor agent can help you collect, process, and export information about containers running in your environment. cAdvisor is deployed on the node level, not per pod. It can auto-discover all containers running on a particular machine and collect system metrics like CPU, network, and memory.
cAdvisor is offered as a built-in, native monitoring feature of Kubernetes. It is also integrated into the kubelet binary, meaning it exists by default on every Kubernetes node. Another advantage of cAdvisor is that it exposes Prometheus metrics out of the box, making it easy to use. However, it does not offer robust functionality in comparison to end-to-end monitoring solutions.
cAdvisor also supports workloads running outside Docker. It shows resource usage of containers and can display metrics on a web-based user interface.
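Many of the metrics cAdvisor exposes are cumulative counters, such as `container_cpu_usage_seconds_total`, so a usage figure comes from comparing two samples over time. The sketch below shows that calculation in plain Python. The function name and the sample values are hypothetical, and the counter-reset handling is a simplification of what a monitoring backend would do.

```python
def cpu_cores_used(prev, curr):
    """Average CPU cores used between two (timestamp_s, cpu_seconds_total) samples."""
    (t0, v0), (t1, v1) = prev, curr
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    if v1 < v0:
        # Counter went backwards, e.g. after a container restart:
        # treat the new value as the whole increase.
        return v1 / elapsed
    return (v1 - v0) / elapsed


# Illustrative samples taken 30 seconds apart.
first = (100.0, 50.0)
second = (130.0, 65.0)
usage = cpu_cores_used(first, second)
print(f"{usage:.2f} cores")
```

Here the container consumed 15 CPU-seconds over a 30-second window, which averages out to half a core.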
GitHub repo: https://github.com/google/cadvisor
Open-source tools are a great way to start your monitoring journey, but they have their limitations. Calico Cloud and Calico Enterprise offer the following advanced features for Kubernetes observability, which go beyond open-source cloud-native monitoring tools: