Calico is an open source networking and network security solution for containers, virtual machines, and native host-based workloads. Calico supports a broad range of platforms including Kubernetes, OpenShift, Docker EE, OpenStack, and bare metal. In this blog, we will focus on Kubernetes pod networking and network security using Calico.
Calico uses etcd as its back-end datastore. When you run Calico on Kubernetes, you can instead reuse the etcd datastore that Kubernetes itself uses, accessed through the Kubernetes API server. This is called a Kubernetes-backed datastore (KDD) in Calico. The following diagram shows a block-level architecture of Calico.
The calico-node pod runs as a DaemonSet and has a fair amount of interaction with the Kubernetes API server. You can profile that interaction by enabling audit logs for calico-node. For example, in my kubeadm cluster, I used the following audit configuration:
To set the context, this is my cluster configuration.
If you ignore the license key API calls from calico-node, you will see that calico-node makes an order of magnitude fewer API calls than Typha. I also sent the logs over to Elasticsearch, and here’s a simple plot for comparison. As you can see, there’s a massive difference between the number of API calls generated by Typha and by calico-node. (Kibana JSON available here).
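If you do not have Elasticsearch at hand, the same comparison can be approximated with a few lines of Python over the raw audit log. This is a minimal sketch, assuming the audit log is written as one JSON event per line with the `user.username` and `objectRef.resource` fields of the Kubernetes audit event format. The service account names and the `licensekeys` resource name in the sample are assumptions for illustration; check what your own logs contain.

```python
import json
from collections import Counter


def count_calls_by_user(audit_lines, ignore_resources=frozenset()):
    """Count audit events per username, skipping the ignored resources."""
    counts = Counter()
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # objectRef is absent for non-resource requests such as /healthz.
        resource = (event.get("objectRef") or {}).get("resource", "")
        if resource in ignore_resources:
            continue
        username = (event.get("user") or {}).get("username", "unknown")
        counts[username] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample events; real ones come from the audit log file.
    typha = "system:serviceaccount:kube-system:calico-typha"  # assumed name
    node = "system:serviceaccount:kube-system:calico-node"    # assumed name
    sample = [
        json.dumps({"verb": "watch", "user": {"username": typha},
                    "objectRef": {"resource": "pods"}}),
        json.dumps({"verb": "watch", "user": {"username": typha},
                    "objectRef": {"resource": "networkpolicies"}}),
        json.dumps({"verb": "get", "user": {"username": node},
                    "objectRef": {"resource": "licensekeys"}}),
        json.dumps({"verb": "patch", "user": {"username": node},
                    "objectRef": {"resource": "nodes"}}),
    ]
    totals = count_calls_by_user(sample, ignore_resources={"licensekeys"})
    for user, total in totals.most_common():
        print(user, total)
```

To run it against a real log, pass an open file object as `audit_lines`; the function only needs an iterable of lines.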
The key point is that the number of API calls from Typha remains constant as you scale the cluster. If you look at the raw log pattern in the gist above, the bulk of the API calls from Typha are watch events. Typha is a read-only system: it watches for changes in various resources and fans them out to all of its clients (the calico-node instances). This significantly reduces the load on the API server. An instance of Typha sits between the datastore (such as the Kubernetes API server) and many instances of Felix, the agent that runs inside calico-node.
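The fan-out idea can be illustrated with a toy model. The sketch below is not Typha's implementation; the `FanOut` class and `simulate` function are hypothetical names, and the model only counts how many events are read from the upstream datastore compared with the case where every node watches the datastore directly.

```python
class FanOut:
    """Toy stand-in for Typha: one upstream read, many downstream deliveries."""

    def __init__(self):
        self.clients = []
        self.upstream_reads = 0

    def register(self, callback):
        """Register a client callback that receives every event."""
        self.clients.append(callback)

    def on_upstream_event(self, event):
        """Read one event from upstream and deliver it to all clients."""
        self.upstream_reads += 1
        for deliver in self.clients:
            deliver(event)


def simulate(num_nodes, num_events):
    """Return (reads with fan-out, reads without fan-out, per-node inboxes)."""
    fan = FanOut()
    inboxes = [[] for _ in range(num_nodes)]
    for inbox in inboxes:
        fan.register(inbox.append)
    for seq in range(num_events):
        fan.on_upstream_event({"type": "MODIFIED", "seq": seq})
    # Without fan-out, each node would read each event from the API server.
    direct_reads = num_nodes * num_events
    return fan.upstream_reads, direct_reads, inboxes


if __name__ == "__main__":
    with_fanout, without_fanout, _ = simulate(num_nodes=100, num_events=20)
    print("upstream reads with fan-out:", with_fanout)
    print("upstream reads without fan-out:", without_fanout)
```

In this model the upstream read count depends only on the number of events, while the direct-watch count grows with the number of nodes, which is the scaling behaviour described above.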
Another way to understand this is to generate a ClusterRole from the audit logs of Typha and calico-node. When reviewing the ClusterRoles, you will see that Typha primarily does get, list, and watch, whereas calico-node does create, update, delete, patch, get, and list on resources. The watch events, which are very frequent given the dynamic nature of configuration and pods, are handled by Typha.
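One way to derive such a role by hand is to group the audit events of a single user by resource and collect the verbs seen for each. The following is a rough sketch of that idea, not the tool used for this post. It assumes one JSON audit event per line with `user.username`, `verb`, and `objectRef` fields; the function names and the username passed in are hypothetical.

```python
import json
from collections import defaultdict


def rules_from_audit(audit_lines, username):
    """Build RBAC-style rules from the audit events of one username."""
    verbs_by_resource = defaultdict(set)
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if (event.get("user") or {}).get("username") != username:
            continue
        ref = event.get("objectRef")
        if not ref:
            continue  # non-resource request, not expressible as a resource rule
        resource = ref.get("resource", "")
        if ref.get("subresource"):
            resource = resource + "/" + ref["subresource"]
        # The core API group is recorded as a missing or empty apiGroup.
        key = (ref.get("apiGroup", ""), resource)
        verbs_by_resource[key].add(event["verb"])
    return [
        {"apiGroups": [group], "resources": [resource], "verbs": sorted(verbs)}
        for (group, resource), verbs in sorted(verbs_by_resource.items())
    ]


def cluster_role(name, rules):
    """Wrap the rules in a ClusterRole-shaped dictionary."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": rules,
    }


if __name__ == "__main__":
    user = "system:serviceaccount:kube-system:calico-typha"  # assumed name
    sample = [
        json.dumps({"verb": "list", "user": {"username": user},
                    "objectRef": {"resource": "pods"}}),
        json.dumps({"verb": "watch", "user": {"username": user},
                    "objectRef": {"resource": "pods"}}),
    ]
    role = cluster_role("typha-observed", rules_from_audit(sample, user))
    print(json.dumps(role, indent=2))
```

A role built this way reflects only what was observed while auditing was enabled, so treat it as a profile of behaviour and not as a complete permission set.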
Without Typha, every calico-node would have to register its own watches with the API server, and the load on the API server would grow in proportion to the number of nodes. With Typha, all the watch events are off-loaded to Typha and read only once from the API server. Hence Typha is not optional; it is a necessary component of your Calico deployment for any decent-sized production cluster.