Securing Kubernetes Endpoints with Open Source Project Calico

This blog concludes a three-part series on Securing Host Endpoints with Project Calico by explaining how to secure endpoints in Kubernetes. The first two installments referenced below provide background on the technical principles applied in this blog.

Before we dive into the technical details, let’s zoom out and review what we are trying to achieve.

Which Bits Are You Going to Trust?

Ken Thompson memorably implied that it is far more important to trust people than software in his A.M. Turing Award acceptance speech, “Reflections on Trusting Trust”. Ken and his contemporaries, hackers before hacking had been fully established in our lexicon, made a game of challenging each other by proposing interesting programming exercises. In his speech, Ken reminisced about one such exercise that cleverly modified the C compiler to insert a backdoor into the login program.

Ken’s amusements have frightening consequences as trojan horses and other malignancies can be hidden in any layer of the stack including the UEFI firmware, boot loader, kernel, user libraries, and applications. Ubiquitous access to open source software mixed with easy deployment of cloud native workloads is a recipe for disaster without an appropriate security posture against constantly evolving threats.

Traditionally, we’ve directed blame for security vulnerabilities toward compute platforms when, in reality, our data centers and corporate networks are chock-full of connected devices running firmware for which we may or may not have the source code. Trusting people is a good theory when applied to people, but falls short when you consider the sea of transitive dependencies on which our systems are built. Pragmatism demands that we mitigate these threats at multiple points in the stack and choose which bits we are going to trust at every layer.

Separating Concerns

When it comes to enforcing network policy and restricting when and how applications can exchange data, the bits we trust are going to be responsible for creating micro-segmentation between applications.

Before the advent of workload virtualization, application segregation was accomplished by placing servers on a policy-specific VLAN. Only applications within the same VLAN could communicate with each other unless a specific hole (policy rule) was punched in the static barrier (typically a firewall) sitting between the two VLANs.

Repurposing a layer-2 reachability concern for isolation, such as arranging workloads into VLANs, artificially induces a hub-and-spoke topology. That may have worked for statically deployed, monolithic workloads, but the shift to microservices and the growth of east-west datacenter traffic demand a mesh topology and isolation primitives that can adapt to constantly changing workloads.

The sections below demonstrate how to separate these concerns by securely deploying a set of services on Kubernetes. In addition, we will use the same mechanisms to secure the Kubernetes control plane.

Deploying and Securing Workloads

Enforcing network security policies in a Kubernetes cluster requires installation of a policy provider such as Project Calico. We’ve provided tutorials to help get you started at http://docs.projectcalico.org/v3.0/getting-started/kubernetes/.

Once you have Project Calico installed, you can start securing workloads (Deployments, DaemonSets, StatefulSets, and so forth) using Kubernetes Network Policies.

Suppose we have a service that serves up REST endpoints on port 80 to another service — let’s call them foo and bar respectively.

We could create a deployment for each service like this:

apiVersion: apps/v1
kind: Deployment
metadata:
 name: foo
 labels:
   app: baz
   component: foo
spec:
 replicas: 3
 selector:
   matchLabels:
     app: baz
     component: foo
 template:
   metadata:
     labels:
       app: baz
       component: foo
       foo: producer
   spec:
     containers:
     - name: foo
       image: foo:1.0.0
       ports:
       - name: http
         containerPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
 name: bar
 labels:
   app: baz
   component: bar
spec:
 replicas: 3
 selector:
   matchLabels:
     app: baz
     component: bar
 template:
   metadata:
     labels:
       app: baz
       component: bar
       foo: consumer
   spec:
     containers:
     - name: bar
       image: bar:1.0.0
       ports:
       - name: http
         containerPort: 80

Typically, you would also define a service for each deployment, like this:

kind: Service
apiVersion: v1
metadata:
 name: foo
spec:
 selector:
   app: baz
   component: foo
 ports:
 - name: http
   protocol: TCP
   port: 80
   targetPort: http
---
kind: Service
apiVersion: v1
metadata:
 name: bar
spec:
 selector:
   app: baz
   component: bar
 ports:
 - name: http
   protocol: TCP
   port: 80
   targetPort: http

If we want to restrict traffic from bar to foo to TCP on port 80, we should deploy network policies that limit both the source (bar) and the target (foo).

First, we limit bar to making egress calls only to foo:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
 name: foo-consumer
spec:
 podSelector:
   matchLabels:
     foo: consumer
 policyTypes:
 - Egress
 egress:
 - to:
   - podSelector:
       matchLabels:
         foo: producer
   ports:
   - protocol: TCP
     port: 80
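
One caveat with an egress policy this strict is that it also blocks DNS lookups from bar, so resolving the service name foo would fail. Below is a minimal sketch of a companion policy, assuming the cluster DNS is reachable on port 53. The policy name foo-consumer-dns is hypothetical, and for brevity the rule allows port 53 to any destination rather than pinning it to the DNS pods.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: foo-consumer-dns   # hypothetical name
spec:
  podSelector:
    matchLabels:
      foo: consumer
  policyTypes:
  - Egress
  egress:
  # No "to" clause: any destination is allowed, but only on the DNS port.
  - ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```

Kubernetes network policies are additive, so pods labeled foo=consumer end up with the union of what the two policies allow.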

Next, we limit traffic to foo by only accepting sources that have the foo=consumer label set:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
 name: foo-producer
spec:
 podSelector:
   matchLabels:
     foo: producer
 policyTypes:
 - Ingress
 ingress:
 - from:
   - podSelector:
       matchLabels:
         foo: consumer
   ports:
   - protocol: TCP
     port: 80

Additionally, we would also need to define ingress rules into bar for clients connecting to it (such as a web front-end for example) and these policies would follow the same pattern — setting up appropriate producer and consumer pairs.
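
As an illustration of that pattern, the ingress side for bar might look like the sketch below. The labels bar=producer and bar=consumer are hypothetical: the first would have to be added to the bar pod template and the second to the front-end pods.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bar-producer
spec:
  podSelector:
    matchLabels:
      bar: producer      # hypothetical label on the bar pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          bar: consumer  # hypothetical label on the front-end pods
    ports:
    - protocol: TCP
      port: 80
```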

Composing Security Policies

Creating policies in pairs as done above allows us to mix and match the roles various workloads are performing. For example, a workload could be dependent on multiple services. By labeling it with multiple consumer roles, an aggregate policy can be produced for any workload through composition.
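
To make the composition concrete, here is a sketch of the pod template labels for a workload that depends on two services. The component name quux and the second role qux=consumer are hypothetical, and the qux role assumes that a matching consumer and producer policy pair exists for that service.

```yaml
# Fragment of a Deployment spec for a hypothetical workload.
template:
  metadata:
    labels:
      app: baz
      component: quux
      foo: consumer   # selected by the foo-consumer egress policy
      qux: consumer   # hypothetical second consumer role
```

Because each consumer policy selects pods by a single label, the effective egress rules for this workload are the union of the policies whose labels it carries.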

Exposing Services with NodePorts

In our example, we configured a service for each type of workload. Policies are not tied to services, but instead are tied to the underlying workloads. In Kubernetes, when one service calls another via its service name or virtual IP (VIP), the kube-proxy component running on each server will perform a destination network address translation (DNAT) for the request, directing it to one of the service's healthy endpoints.

For in-cluster traffic, this means the source of the request is the originating pod IP and the destination is the target pod IP. Since both workloads are managed by Kubernetes, our policies above use labels to abstract away these IPs and make the policies applicable to a larger class of workloads.

For requests originating outside the cluster, however, the same rules do not apply. If we expose a service for external consumption by other services using a nodeport, the well-known endpoint for those consumers is a cluster edge node IP (possibly load balanced by an external device) and nodeport. Relying on internal cluster IP addresses and target ports will be ineffective since these are obscured after kube-proxy has handled and forwarded the request.

Creating a Host Endpoint and PreDNAT policy

To write a policy that secures access to this nodeport-exposed service, we need to create a host endpoint and an associated policy that will be evaluated before any DNAT occurs.

First, we create a host endpoint using the below syntax (these resources are native to Project Calico and are not part of Kubernetes network policy):

apiVersion: projectcalico.org/v3
kind: HostEndpoint
metadata:
 name: node1
 labels:
   role: ingress
spec:
 node: node1
 interfaceName: eth0

This host endpoint has the label role=ingress that we can use to apply a GlobalNetworkPolicy for our service.

Now let’s modify the service definition for bar and expose it using a well-known nodeport so that traffic from outside the cluster, such as requests from a browser, can reach it:

kind: Service
apiVersion: v1
metadata:
 name: bar
spec:
 selector:
   app: baz
   component: bar
 ports:
 - name: http
   protocol: TCP
   port: 80
   targetPort: http
   nodePort: 30010
 type: NodePort

Our service is now exposed on port 30010 on all cluster nodes.
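
Because the nodeport is open on every node, a policy keyed on host endpoint labels only protects the nodes that have a host endpoint defined. A sketch for a hypothetical second node follows; the node name node2 and the interface name eth0 are assumptions to be replaced with the values from your cluster.

```yaml
apiVersion: projectcalico.org/v3
kind: HostEndpoint
metadata:
  name: node2          # hypothetical second node
  labels:
    role: ingress      # same label as node1, so the same policies apply
spec:
  node: node2
  interfaceName: eth0  # assumed; use the node's actual external interface
```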

Finally, let’s suppose we want to limit traffic to users in our intranet (10.0.0.0/8). We do this by applying the following Global Network Policies:

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
 name: bar-external-consumer
spec:
 order: 10
 preDNAT: true
 applyOnForward: true
 ingress:
 - action: Allow
   source:
     nets: [10.0.0.0/8]
 selector: role == 'ingress'
---
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
 name: drop-other-ingress
spec:
 order: 20
 preDNAT: true
 applyOnForward: true
 ingress:
 - action: Deny
 selector: role == 'ingress'

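A policy with preDNAT set to true must also set applyOnForward to true, because pre-DNAT policy is evaluated for traffic that the host forwards on to workloads as well as for traffic that terminates on the host itself.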

This policy allows any traffic to nodeports from the 10.0.0.0/8 address range. It can be restricted further using additional filters on the destination; however, take care not to block traffic to other essential endpoints such as the API server.
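
As a sketch of such a destination filter, the allow rule could be narrowed to the nodeport itself, with a separate rule keeping the API server reachable. The selector matches the role=ingress label on the host endpoint defined earlier. The policy name allow-apiserver and port 6443 are assumptions; substitute the port your API server actually listens on.

```yaml
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: allow-apiserver       # hypothetical name
spec:
  order: 5                    # evaluated before the policies below
  preDNAT: true
  applyOnForward: true
  selector: role == 'ingress'
  ingress:
  - action: Allow
    protocol: TCP
    destination:
      ports: [6443]           # assumed API server port
---
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: bar-external-consumer
spec:
  order: 10
  preDNAT: true
  applyOnForward: true
  selector: role == 'ingress'
  ingress:
  - action: Allow
    protocol: TCP             # required when matching on ports
    source:
      nets: [10.0.0.0/8]
    destination:
      ports: [30010]          # the nodeport exposed for bar
```

Lower order values are evaluated first, so the API server rule takes effect before the intranet rule and the final deny.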

Extending Trust

To further limit exposure, we can use the same mechanisms discussed above to also protect the Kubernetes control plane (watch for a future blog post). Project Calico thus allows us to extend protection to every layer of our system using a consistent network policy framework.

Want more details? For a deeper dive on using host endpoints, check out our documentation on Securing Host Endpoints.