Connection tracking (“conntrack”) is a core feature of the Linux kernel’s networking stack. It allows the kernel to keep track of all logical network connections or flows, and thereby identify all of the packets which make up each flow so they can be handled consistently together.
Conntrack is an important kernel feature that underpins some key mainline use cases, such as NAT (so that all packets in a flow are translated consistently), stateful firewall rules (for example, allowing return traffic only for established connections), and Kubernetes service load balancing via kube-proxy.
In addition, conntrack normally improves performance (reduced CPU and reduced packet latencies) since only the first packet in a flow needs to go through the full network stack processing to work out what to do with it. See the “Comparing kube-proxy modes” blog for one example of this in action.
However, conntrack has its limits…
The conntrack table has a configurable maximum size and, if it fills up, connections typically start getting rejected or dropped. For most workloads there’s plenty of headroom in the table and this will never be an issue. However, there are a few scenarios where the conntrack table needs a bit more thought: workloads handling an extremely high rate of connections per second (short-lived connections can linger in the table well after they close), and workloads handling an extremely high number of concurrent connections.
There are some niche workload types that fall into these categories. In addition, if you’re in a hostile environment, flooding your server with lots of half-open connections can be used as a denial-of-service attack. In either case, conntrack can become the limiting bottleneck in your system. For some scenarios, tuning conntrack by increasing the table size or reducing its timeouts may be enough to meet your needs (though if you get this tuning wrong it can lead to a lot of pain). For others, you need to bypass conntrack for the offending traffic entirely.
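For reference, the tuning knobs in question are standard netfilter sysctls. A quick sketch of inspecting and adjusting them is below; the specific values are purely illustrative, not recommendations, and the right numbers depend entirely on your workload and available RAM:

```shell
# Check current conntrack usage against capacity.
sysctl net.netfilter.nf_conntrack_count   # entries currently in use
sysctl net.netfilter.nf_conntrack_max     # table size limit

# Illustrative tuning (requires root; values are examples only):
sysctl -w net.netfilter.nf_conntrack_max=1048576                 # grow the table
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30    # default is 120s
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=3600 # default is 5 days
```

Note that growing `nf_conntrack_max` is exactly where the RAM cost mentioned below comes from: every entry occupies kernel memory for as long as its timeout keeps it alive.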
To give a concrete example, one large SaaS provider we worked with had a set of memcached servers running on bare metal servers (not virtualized or containerized) each handling 50k+ short-lived connections per second. This is way more than a standard Linux config can cope with.
They had experimented with tuning conntrack configuration to increase table sizes and reduce timeouts, but the tuning was fragile, the increased RAM use was a significant penalty (think GBytes!), and the connections were so short-lived that conntrack was not giving its usual performance benefits (reduced CPU or packet latencies).
Instead, they turned to Calico. Calico’s network policies allow you to bypass conntrack for specific traffic (using the doNotTrack flag). This gave them the performance they needed, plus the additional security benefits that Calico brings.
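A do-not-track policy looks something like the following sketch. The `doNotTrack` flag and its semantics come from Calico’s GlobalNetworkPolicy; the policy name, label selector, and port are assumptions for a memcached-style workload:

```yaml
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: memcached-dont-track    # hypothetical name
spec:
  selector: app == 'memcached'  # hypothetical workload label
  applyOnForward: true          # required when doNotTrack is set
  doNotTrack: true              # bypass conntrack for matching traffic
  ingress:
    - action: Allow
      protocol: TCP
      destination:
        ports: [11211]          # memcached default port
  egress:
    # Untracked traffic has no connection state, so return packets
    # must be allowed explicitly rather than via "established" state.
    - action: Allow
      protocol: TCP
      source:
        ports: [11211]
```

Because the traffic is untracked, the usual stateful shortcut of “allow replies to established connections” is unavailable, which is why the egress rule explicitly permits packets from the server port.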
We tested a single memcached server pod with a multitude of client pods running on remote nodes, so we could drive very high connection rates. The host running the memcached server pod had 8 cores and a 512k-entry conntrack table (the standard setting based on the host’s size). We measured performance differences between: no network policy; Calico normal network policy; and Calico do-not-track network policy.
In the first test we limited the connections to 4,000 per second so we could focus on CPU differences. There was no measurable performance difference between no policy and normal policy, but do-not-track policy reduced CPU usage by around 20%.
In the second test, we pushed as many connections as our clients could muster and measured the maximum number of connections per second the memcached server was able to process. As expected, no policy and normal policy both hit the conntrack table limit at just over 4,000 connections per second (512k table entries / 120s default TCP TIME_WAIT conntrack timeout ≈ 4,369 connections/s). With do-not-track policy in place, our clients pushed 60,000 connections per second without hitting any issues. We are confident we could have pushed beyond this by spinning up even more clients, but felt the numbers were already enough to illustrate the point of this blog!
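The back-of-envelope cap above follows from dividing the table size by how long each closed connection’s entry lingers (the 120s default TCP TIME_WAIT conntrack timeout):

```shell
# Sustainable connection rate before the conntrack table fills:
# table size / per-entry lifetime after the connection closes.
table_size=$((512 * 1024))   # 512k entries
timeout_s=120                # default nf_conntrack_tcp_timeout_time_wait
echo $((table_size / timeout_s))   # prints 4369
```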
Conntrack is an important kernel feature. It’s good at what it does. Many mainline use cases depend on it. However, for some niche scenarios, the overhead of conntrack outweighs the normal benefits it brings. In these scenarios, Calico network policy can be used to selectively bypass conntrack while still enforcing network security. For all other traffic, conntrack continues to be your friend!