Navigating the security challenges of multi-tenancy in a cloud environment

Multi-tenancy can maximize resource utilization in a cluster by sharing its resources between different groups, teams, or customers. However, boundaries must be put in place to avoid the problems associated with resource sharing. In addition, the number of security policies in a multi-tenant cluster can gradually grow to the point where a slight misconfiguration causes major security problems, performance issues, and service disruptions.

In this blog post, we will focus on multi-tenancy issues such as bandwidth shortage, security policy scaling, and privacy impacts, and suggest a few solutions that you can deploy in your environment. We will also look at how an eBPF-based security design can offer better performance and help you navigate a complex multi-tenant environment with ease.

What is multi-tenancy?

Virtualization, containerization, and other technologies that allow a range of workloads to share the underlying hardware have a common goal: allocate resources as efficiently as possible and make the most of the available hardware. However, workloads running in such an environment commonly do not use all the power that the hardware can offer, and in many cases leave a significant amount of it idle.

In order to make the most of these unused resources, it is possible to share the environment with others, allowing them to tap into the unused potential of our clusters and get a better return on investment. In Kubernetes, this practice is called multi-tenancy.

Addressing networking issues in a multi-tenant environment

In a multi-tenant environment, bandwidth shortage can result in slow communication and limited resources for certain workloads, while unencrypted communication can lead to packet sniffing, which can be used to intercept and read sensitive information. These issues must be closely managed to ensure a secure and efficient multi-tenant environment.

Bandwidth restrictions

While network bandwidth is usually not a major concern, in a busy cluster with multiple tenants and workloads it can become a scarce resource.

The following code block is an example of a cluster using Calico and the bandwidth management plugin:

cat /etc/cni/net.d/10-calico.conflist

{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
...
    {
      "type": "bandwidth",
      "capabilities": {"bandwidth": true}
    }
  ]
}

In multi-tenant environments, to manage the available bandwidth better and implement a fair sharing model between different groups of users, we can set per-pod limits by adding bandwidth annotations to our workloads.

The following example shows the annotations that can be used to limit a pod’s bandwidth in Kubernetes:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/ingress-bandwidth: 1M
    kubernetes.io/egress-bandwidth: 1M
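In practice, tenants usually run pods through a controller rather than creating them directly, in which case the annotations belong on the pod template. The following is a minimal sketch under that assumption; the names, namespace, image, and the 10M limits are hypothetical placeholders, not values from a real cluster.

```yaml
# Hypothetical tenant workload; every name and value here is a placeholder.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tenant-a-web
  namespace: tenant-a
spec:
  replicas: 2
  selector:
    matchLabels:
      app: tenant-a-web
  template:
    metadata:
      labels:
        app: tenant-a-web
      annotations:
        # Annotations go on the pod template so that every replica
        # created by the Deployment carries the same limits.
        kubernetes.io/ingress-bandwidth: 10M
        kubernetes.io/egress-bandwidth: 10M
    spec:
      containers:
        - name: web
          image: nginx:1.25
```

Because the limits are defined per pod, the total bandwidth available to a tenant still depends on how many replicas it runs.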

IP address restrictions

IP address management (IPAM) is another simple component that can be used to exercise more control over tenants in a cluster. Pods acquire IP addresses by requesting them from the IPAM plugin, which can divide tenants into different segments using IP pools and labels. This segmentation gives cluster administrators fine-grained control, allowing them to impose an identity on network traffic that leaves the cluster and goes to external networking devices.

Once tenants are sorted into separate IP segments, it becomes possible to apply bandwidth restrictions and other types of control to each tenant’s network capabilities. These controls can even extend to devices that are not part of the cluster or cannot integrate with the Kubernetes API server.
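As a rough sketch of this segmentation with Calico IPAM, a dedicated IP pool can be created for a tenant and then requested from the tenant's namespace with an annotation. The pool name, CIDR, and namespace below are hypothetical, and the example assumes Calico is both the CNI and the IPAM plugin.

```yaml
# Hypothetical pool reserved for one tenant; the CIDR is a placeholder.
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: tenant-a-pool
spec:
  cidr: 10.48.0.0/24
  natOutgoing: true
  # Matches no node, so the pool is skipped for automatic assignment
  # and is only used when requested explicitly by annotation.
  nodeSelector: "!all()"
---
# Pods created in this namespace request addresses from the tenant pool.
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a
  annotations:
    cni.projectcalico.org/ipv4pools: '["tenant-a-pool"]'
```

With this in place, traffic from the tenant can be recognized by its source range on devices outside the cluster.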

The following code block shows an imposed limitation on a range of IP addresses that are assigned to noisy neighbors in a busy cluster:

tc qdisc add dev eth0 root handle 1: cbq avpkt 1000 bandwidth 100mbit

tc filter add dev eth0 parent 1: protocol ip prio 16 u32 match ip src 192.168.0.0/16 flowid 1:1

Route propagation

Some CNIs offer dynamic routing via routing protocols. A routing protocol is a mutual agreement on the best path that packets should travel to reach their destinations. There are several routing protocols, each with its own strengths and weaknesses. At their core, however, they all aim to let different devices share information about the addresses recorded in their routing tables. By implementing a route propagation strategy, it’s possible to add an extra layer of protection that prevents unauthorized devices from viewing sensitive resources.

The following resource shows how route propagation can be limited for an IP pool using Calico:

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: my.ippool-1
spec:
  disableBGPExport: true
...

Encryption

In multi-tenant environments, we can use namespaces to isolate workloads. However, if a malicious user gains access to a containerized environment via a misconfigured application or by exploiting a vulnerability inside the system, they may be able to observe the data that is being transmitted within the cluster, which is a privacy concern. This is a critical issue to be aware of in any multi-user environment, as a vulnerability could potentially impact multiple businesses at once.

In such a scenario, encryption is a great tool for preventing unauthorized people from observing sensitive information. Mutual TLS (mTLS), in which both sides of a connection authenticate each other with certificates, became mainstream in Kubernetes with the rise of the service mesh.

Read our documentation, Encrypt in-cluster pod traffic, to learn more about mutual transport layer security and service mesh.

A service mesh deploys a sidecar container in each pod to intercept traffic before it leaves. The sidecar then uses mTLS and digital certificates to establish secure communication between resources, seamlessly encrypting all traffic that enters or leaves the pod.
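As one illustration, assuming Istio is the service mesh and sidecar injection is already enabled for the tenant's namespace, a PeerAuthentication resource can require mTLS for every workload in that namespace. The namespace name is hypothetical.

```yaml
# Assumes Istio is installed; "tenant-a" is a placeholder namespace.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: tenant-a
spec:
  mtls:
    # STRICT rejects plaintext connections to workloads in this namespace.
    mode: STRICT
```

Strict mode also blocks clients that have no sidecar, so a permissive rollout is often used first while workloads are being migrated.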

Check out Enforce network policy for Istio to learn more about mutual TLS and how to deploy it in your cluster.

Security policy

Security policy resources are declarative resources that control the flow of traffic. They will be familiar to anyone who has worked with firewalls or security devices. In a multi-tenant environment, they can give tenants limited control over their own network segments without compromising the security of other tenants.

Role-based access control

Kubernetes role-based access control (RBAC) is an authorization mechanism that ensures users and workloads in a cluster can only access the resources that they are entitled to. It can grant permissions such as reading, creating, updating, or deleting resources like pods, services, secrets, network policies, and security policies.

For example, an administrator could use Kubernetes network policies to allow tenants to have the required network permissions within their namespace.
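RBAC is what lets tenants manage these policies themselves. As a minimal sketch, the following Role and RoleBinding would allow a tenant group to manage Kubernetes network policies only inside its own namespace. The namespace, the group name, and the resource names are hypothetical; the group must match whatever identity your cluster's authentication actually provides.

```yaml
# Hypothetical names throughout; adjust to your own tenants and groups.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: network-policy-editor
  namespace: tenant-a
rules:
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-a-network-policy-editors
  namespace: tenant-a
subjects:
  - kind: Group
    name: tenant-a-developers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: network-policy-editor
  apiGroup: rbac.authorization.k8s.io
```

Because a Role is namespaced, members of the group gain no permissions over policies in any other tenant's namespace.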

Next, CNI security policy resources, such as Calico’s NetworkPolicy resource, can be deployed to a namespace. These offer additional features suited to a tenant’s security team, so that they can implement their own standards and rules.

Finally, the cluster administrator can use global resources such as GlobalNetworkPolicy to establish a standard security model for the cluster as a whole.
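The layering described above could be sketched with two Calico resources: a namespaced policy owned by a tenant's security team, and a cluster-wide baseline owned by the administrator. The labels, ports, order values, and the `tenant` namespace label are all hypothetical choices made for illustration.

```yaml
# Tenant-owned rule: frontend pods may reach backend pods on TCP 8080.
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: tenant-a
spec:
  order: 100
  selector: app == 'backend'
  types:
    - Ingress
  ingress:
    - action: Allow
      protocol: TCP
      source:
        selector: app == 'frontend'
      destination:
        ports:
          - 8080
---
# Administrator-owned baseline for every namespace that carries a
# (hypothetical) "tenant" label: allow DNS lookups, deny the rest.
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: tenant-default-deny
spec:
  order: 1000
  namespaceSelector: has(tenant)
  types:
    - Ingress
    - Egress
  egress:
    - action: Allow
      protocol: UDP
      destination:
        selector: 'k8s-app == "kube-dns"'
        ports:
          - 53
```

Calico evaluates policies with a lower order first, so the tenant's explicit allow rule is applied before the baseline denies whatever remains unmatched.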

The following image illustrates the model that we just discussed, better known as the shift-left policy concept:

A diagram shows Kubernetes and Calico network policies, illustrating the policy evaluation order and rules set by different dev teams.

If you’re interested in RBAC, read our blog post, How to integrate Kubernetes RBAC and Calico to achieve “Shift-Left” Security to learn more.

Policy efficiency

Flexibility is often highlighted as one of the advantages of a cloud environment. But since this flexibility also impacts the network and other parts of your setup, it can create new challenges that must be addressed in a cloud-native format.

For example, security policies are an effective way to secure a cluster, but in a multi-tenant environment, misconfigured policies can consume too many hardware resources. To address this, technologies like the Calico eBPF data plane can be used. This data plane converts policies into efficient eBPF programs and loads them directly into the kernel, allowing a large number of policies to be enforced with far less concern about resource consumption. Here is a comparison regarding the policy efficiency of the two Linux data planes.
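Separately from that comparison, switching a cluster to the eBPF data plane is largely a configuration change. The following is a minimal sketch, assuming Calico was installed with the Tigera operator and that the other prerequisites in the Calico documentation (such as giving Calico direct access to the Kubernetes API server) have already been met. It shows only the relevant field, not a complete procedure.

```yaml
# Assumes an operator-managed Calico installation; other fields of the
# existing Installation resource are omitted here and must be kept.
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    # Selects the eBPF data plane instead of the default iptables one.
    linuxDataplane: BPF
```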

Conclusion

In conclusion, multi-tenancy in Kubernetes can help you maximize resource utilization and achieve a better return on investment by sharing a cluster between multiple tenants. However, it also presents several challenges such as bandwidth shortage, security policy scaling, and privacy impacts. In order to address these challenges, it’s important to have a well-designed environment that takes these factors into consideration.

By implementing the right tools and policies, multi-tenancy in Kubernetes can provide a positive experience for all tenants and improve overall resource utilization in the cluster.

Not sure what to learn next? Check out this panel, Cyber security challenges in Cloud native world and the role of shift left to mitigate them.