Prometheus Operator

What Is the Prometheus Operator?

Kubernetes lets you use software extensions, called operators, to manage applications automatically. In Kubernetes, an operator is a controller that works with the Kubernetes API to handle application processes without human intervention.

The Prometheus operator automates the configuration and management of the Prometheus monitoring stack that runs on a Kubernetes cluster. Here are key features of the Prometheus operator:

  • Custom resources – The controller can use custom resources to deploy and manage Alertmanager, Prometheus, and other related components.
  • Deployment configuration – The controller provides a Kubernetes-native resource to configure persistence, replicas, and retention policies.
  • Prometheus target configuration – The controller automatically generates configurations for scraping metrics from targets, based on Kubernetes label queries. This eliminates the need to learn a configuration language specific to Prometheus.

Custom Resource Definitions (CRDs)

The Prometheus operator includes several custom resource definitions (CRDs).

Prometheus

The Prometheus CRD declaratively defines a Prometheus deployment to run on a Kubernetes cluster. It offers options for configuring persistent storage, replication, and the Alertmanager instances to which a deployed Prometheus instance sends alerts.

The operator deploys an appropriately configured StatefulSet for every Prometheus resource, in the same namespace. Each Prometheus pod mounts a secret called <prometheus-name>, which contains the Prometheus configuration.
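As an illustration of the options described above, here is a minimal sketch of a Prometheus resource. The resource name, namespace, service account, label values, storage size, and Alertmanager Service name are all hypothetical placeholders, not values taken from this guide.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example                  # hypothetical name
  namespace: monitoring          # hypothetical namespace
spec:
  replicas: 2                    # replication: two Prometheus pods
  retention: 15d                 # how long to keep metrics
  serviceAccountName: prometheus # assumed to exist with RBAC for discovery
  serviceMonitorSelector:        # which ServiceMonitors this instance picks up
    matchLabels:
      team: frontend             # hypothetical label
  alerting:
    alertmanagers:
      - namespace: monitoring
        name: alertmanager-example   # assumed Service in front of Alertmanager
        port: web
  storage:                       # persistent storage for each replica
    volumeClaimTemplate:
      spec:
        resources:
          requests:
            storage: 20Gi
```

Applying a manifest like this (for example with kubectl apply -f) should cause the operator to create the corresponding StatefulSet in the same namespace.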

Alertmanager

The Alertmanager CRD declaratively defines settings for Alertmanagers running on a Kubernetes cluster. It offers several options for configuring persistent storage and replication.

The Prometheus operator deploys a StatefulSet for every Alertmanager resource, in the same namespace. Each Alertmanager pod mounts a secret called <alertmanager-name>, which contains the configuration file in the alertmanager.yaml key.
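A minimal sketch of an Alertmanager resource might look like the following. The name, namespace, replica count, and storage size are assumptions chosen for illustration.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example            # hypothetical name
  namespace: monitoring    # hypothetical namespace
spec:
  replicas: 3              # replication: three Alertmanager pods
  storage:                 # persistent storage for each replica
    volumeClaimTemplate:
      spec:
        resources:
          requests:
            storage: 1Gi
```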

ServiceMonitor

The ServiceMonitor CRD declaratively defines how to monitor a dynamic set of services. You can use label selections to define the services to be monitored in the desired configuration.

This allows organizations to define rules for how metrics are published. ServiceMonitor will follow these rules to automatically detect new services without system reconfigurations.

For Prometheus to monitor your Kubernetes applications, an Endpoints object must exist. An Endpoints object is essentially a list of IP addresses.

Endpoints objects are typically populated by Service objects. A Service uses its label selector to find matching pods and adds them to the Endpoints object. A Service can expose one or more ports, each typically backed by a list of endpoints that point to pods.

The Prometheus operator introduces the ServiceMonitor object, which discovers these Endpoints objects and tells Prometheus to monitor the pods they list.
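The relationship between a Service and a ServiceMonitor can be sketched as follows. The application name, labels, port name, and port number are hypothetical; the ServiceMonitor's own team label is assumed to match the serviceMonitorSelector of some Prometheus resource.

```yaml
# A Service that selects the application pods (hypothetical app).
apiVersion: v1
kind: Service
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
    - name: web          # named port referenced by the ServiceMonitor
      port: 8080
---
# A ServiceMonitor that selects the Service above by label.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend       # assumed to match a Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - port: web          # scrape the Service port named "web"
      path: /metrics
      interval: 30s
```

Any new Service carrying the app: example-app label would then be picked up without further changes to the Prometheus configuration.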

PodMonitor

The PodMonitor CRD declaratively defines how to monitor a dynamic group of pods. It specifies how Prometheus servers should discover and scrape those pods. A pod is a collection of one or more containers, which can expose Prometheus metrics on multiple ports.

You can use label selection to define the pods you want to monitor in your desired configuration. This lets you introduce rules for how metrics are published. Based on these rules, you can automatically discover new pods without reconfiguring the system.
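A PodMonitor follows the same pattern but selects pods directly, with no Service in between. In this sketch the labels and port name are hypothetical; the port value refers to a named container port on the selected pods.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app
  labels:
    team: frontend       # assumed to match a Prometheus podMonitorSelector
spec:
  selector:
    matchLabels:
      app: example-app   # hypothetical pod label
  podMetricsEndpoints:
    - port: web          # named container port that serves metrics
      path: /metrics
```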

Probe

The Probe CRD declaratively defines how to monitor a set of Ingresses and static targets. In addition to targets, a Probe object requires a prober, which is a service that checks the targets and generates metrics for Prometheus to scrape.
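As a rough sketch, a Probe that checks a static URL could look like this. It assumes a prober such as the Blackbox exporter is already running at the address shown; that address, the module name, and the target URL are all hypothetical.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: example-probe
spec:
  jobName: http-get
  interval: 60s
  module: http_2xx       # module assumed to be defined in the prober's config
  prober:
    url: blackbox-exporter.monitoring.svc:9115   # hypothetical prober address
  targets:
    staticConfig:
      static:
        - https://example.com                    # hypothetical target
```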

PrometheusRule

The PrometheusRule CRD can declaratively define Prometheus rules used by one or multiple instances of Prometheus. You can save and apply alerting and recording rules as YAML files, which are loaded dynamically without requiring a restart.
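For illustration, a PrometheusRule holding one recording rule and one alerting rule might be written as below. The rule names, expressions, and the role label are assumptions; the label would need to match the ruleSelector of the Prometheus resource that should load these rules.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-rules
  labels:
    role: alert-rules    # assumed to match a Prometheus ruleSelector
spec:
  groups:
    - name: example.rules
      rules:
        # Recording rule: precompute the per-job request rate.
        - record: job:http_requests:rate5m
          expr: sum by (job) (rate(http_requests_total[5m]))
        # Alerting rule: fire when a target has been down for 5 minutes.
        - alert: TargetDown
          expr: up == 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Target {{ $labels.instance }} is down"
```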

AlertmanagerConfig

The AlertmanagerConfig CRD declaratively defines parts of the Alertmanager configuration. This allows you to route alerts to custom receivers and set inhibition rules.

AlertmanagerConfig can be defined at the namespace level and provides an aggregated configuration for Alertmanagers.
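A namespaced AlertmanagerConfig with one route, one receiver, and one inhibition rule could be sketched as follows. The namespace, receiver name, webhook URL, and label are hypothetical, and the label is assumed to match the alertmanagerConfigSelector of an Alertmanager resource.

```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: example-config
  namespace: team-a                # hypothetical namespace
  labels:
    alertmanagerConfig: example    # assumed to match alertmanagerConfigSelector
spec:
  route:
    receiver: team-webhook
    groupBy: ["alertname"]
    groupWait: 30s
    groupInterval: 5m
    repeatInterval: 12h
  receivers:
    - name: team-webhook
      webhookConfigs:
        - url: http://example.com/hook   # hypothetical webhook endpoint
  inhibitRules:
    # Suppress warning alerts while a critical alert with the same name fires.
    - sourceMatch:
        - name: severity
          value: critical
      targetMatch:
        - name: severity
          value: warning
      equal: ["alertname"]
```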

How to Install the Prometheus Operator

There are three ways to set up the Prometheus monitoring stack in Kubernetes:

Installing Prometheus Operator Manually

If you are extremely knowledgeable about Prometheus components and their prerequisites, you can manually deploy the YAML specification file for each component. However, you would need to correctly deploy all Secrets and ConfigMaps in the right order. This is time consuming and error prone, and is generally not recommended. Another downside is that your deployment will be difficult to replicate in other environments.

Deploying Prometheus Operator with kube-prometheus

An easier way to set up Prometheus Operator is to deploy it using the manifests from the kube-prometheus project.

kube-prometheus deploys the Prometheus Operator and schedules a Prometheus instance called prometheus-k8s with alerts and rules applied by default.

To deploy Prometheus Operator with kube-prometheus:

1. Get a compiled version of Kubernetes manifests by cloning the kube-prometheus repo from GitHub:

git clone https://github.com/prometheus-operator/kube-prometheus.git

2. Change into the project’s root directory and create a namespace and CRDs. Wait until they are available before proceeding. It is important to create the CRDs and namespace first to prevent race conditions when you deploy the monitoring components.

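3. Run these commands to create the required resources in your cluster. The first kubectl command creates the namespace and CRDs, the loop waits until the ServiceMonitor CRD is being served, and the last command creates the remaining monitoring components:

cd kube-prometheus
kubectl create -f manifests/setup
until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
kubectl create -f manifests/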

Deploying the Operator Using the Helm Chart

Another option for deploying the Prometheus Operator is to use the Helm charts maintained by the Prometheus community. Helm performs an initial installation of Prometheus Operator together with the following additional components:

  • Prometheus
  • Alertmanager
  • Grafana
  • Node-exporter and other exporters

The Prometheus Operator then manages the entire lifecycle of these custom resources. The components can immediately work together after installation to perform basic cluster monitoring.

To deploy Prometheus Operator using the Helm chart, run these commands:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

helm repo update

helm install prometheus prometheus-community/kube-prometheus-stack
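The chart's defaults can be overridden with a values file. The sketch below assumes the top-level keys used by the kube-prometheus-stack chart at the time of writing; key names can change between chart versions, so check them against the output of helm show values prometheus-community/kube-prometheus-stack. The file name and the specific settings are illustrative.

```yaml
# values.yaml (hypothetical file name)
grafana:
  enabled: true          # deploy Grafana alongside Prometheus
alertmanager:
  enabled: true          # deploy Alertmanager
nodeExporter:
  enabled: true          # deploy node-exporter on each node
prometheus:
  prometheusSpec:
    replicas: 1
    retention: 10d       # how long Prometheus keeps metrics
```

The file can then be passed to the install command with the -f flag, for example: helm install prometheus prometheus-community/kube-prometheus-stack -f values.yaml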

Kubernetes Monitoring and Observability with Calico

Because Kubernetes workloads are highly dynamic and ephemeral, and are deployed on a distributed and agile infrastructure, they pose a unique set of monitoring and observability challenges. As such, Kubernetes-native monitoring and observability is required to monitor and troubleshoot communication issues between microservices in the Kubernetes cluster.

More specifically, context about microservices, pods, and namespaces is needed so that multiple teams can collaborate effectively to identify and resolve issues. Calico helps rapidly pinpoint and resolve performance, connectivity, and security policy issues between microservices running on Kubernetes clusters across the entire stack.

Calico Cloud and Calico Enterprise offer the following key features for Kubernetes monitoring and observability:

  1. Dynamic service and threat graph – Provides visibility across the stack from the network layer to the application layer, showing a runtime view of how namespaces, services, and pods are operating in your cluster. A point-to-point, topographical representation of traffic flow and policy shows how workloads within the cluster are communicating, and across which namespaces, with the ability to filter resources, save views, and troubleshoot service issues.
  2. DNS dashboard – Helps accelerate DNS-related troubleshooting and problem resolution in Kubernetes environments by providing an interactive UI with exclusive DNS metrics.
  3. Dynamic packet capture – Captures packets from a specific pod or collection of pods on-demand, with specified packet sizes and duration, in order to troubleshoot performance hotspots and connectivity issues faster.
  4. Application-level observability – Provides a centralized, all-encompassing view of service-to-service traffic in the cluster to detect anomalous behavior like attempts to access applications or restricted URLs, and scans for particular URLs. A Layer 7 dashboard provides a view of HTTP communication across the cluster, with summaries of top URLs, request duration, response codes, and volumetric data for each service.
  5. Unified Controls – A single, unified management plane provides a centralized point-of-control for unified security and observability on multiple clouds, clusters, and distros. Users can monitor and observe across environments with a single pane of glass.

Learn more about Calico for Kubernetes monitoring and observability
