5 Tips for Organizing Your Kubernetes Security Policies

During this webinar, you will hear the top mistakes to avoid when defining enterprise security policies for Kubernetes, learn 5 things you can do today to define a security taxonomy that will scale with you into the future, and ask questions during live Q&A with the industry experts in securing Kubernetes.

Complete Transcript

Andy: Our webinar, the topic for today is Five Tips for Organizing your Kubernetes Security Policies.

Cody: Thanks Andy. This is Cody McCain. I’m a senior solutions architect for Tigera and I’m based out of New York City. And today, I want to share some learnings that we’ve had about how to use labels and such, not only for organizing our policy, but also for aligning it to the right security organizations, and for simplifying the overall process of assigning security policies.

Cody: A little bit about myself. If you are ever in the New York area or want to meet up with me, hit me up on Twitter at @McCodeMan. I’d love to exchange some conversation with you and learn a little bit more about what projects you’re working on, and any way that I can help you apply the concepts that we’re going to be talking about today. So, the way that this webinar typically works, I’m going to go through a short presentation. At the end of the presentation, we’re going to have a Q&A panel and we will definitely hang around to answer all the questions. Typically, the presentation again is 20 to 30 minutes and then we typically have anywhere from 10 to 15 minutes of questions, so hopefully you all can stay around for that. If you use the interface, the Zoom interface, you can actually submit your questions in writing, and those will queue up and we’ll address those in the order that we receive them. So, without further ado, let’s get started today.

Cody: So, the first thing I want to talk about is this magical term called DevOps. We’re all using it. A lot of us are actually participating in and practicing it. And typically, the definition that we use is this: a set of practices to reduce the time between committing a change to a system and the change being placed into normal production, while ensuring high quality. And I think when it comes to network policy, and the configuration of that, the really key thing for us here is to reduce the time. Many of us are coming from legacy systems where we used firewalls and had a centralized group responsible for making any changes to those firewall rules. And that works well when you’re dealing with smaller monolithic applications, and maybe things that aren’t scaling and undergoing the dynamic changes that we’re now seeing with containers, and with container orchestration in particular.

Cody: And so, I want to use today to talk about how we actually use the metadata associated with these dynamic workloads to hopefully speed up that process, and actually apply it, not only to the way that you develop software, the way you QA it, and the way you run the software, but actually to how you secure it as well. A lot of times, what ends up happening to our security teams is this: these DevOps operations have some great ideas. So, we’re going to make everything as code, as an example, and we’re going to use CI/CD pipelines. We’re going to run quickly, we’re going to iterate often. We’re going to have many deployments a day. That works great in a playground, but at the end of the day, there’s somebody responsible for maintaining compliance within an organization. And too often, the security team or the compliance team is left having to clean up after these other teams that are running too quickly.

Cody: And what we really want to do is involve that security team in the overall DevOps team and actually call it DevSecOps. It’s important for everybody to be responsible for security. As we decompose our monolithic applications and move to these dynamic orchestration platforms, the only way that actually scales, and the only way that we can do it efficiently, is that everybody has to have a hand and a role in securing our systems. And so, the first tip that I want to give in terms of taxonomy is, when you’re thinking about taxonomy and how you actually want to describe your workloads, all of the metadata, the labels, that will describe your workloads, it is paramount that you involve the security team in that process, for a couple of reasons. One is you don’t want a separate set of metadata only for security, while all the other metadata governs the operations, and the development, and the QA.

Cody: It’s a lot easier for our developers if we have one set of metadata. And so if we can get the security teams, and the overall DevSecOps process, to shift the security requirements all the way to the left of our design life cycles, we can get the security team involved in making good decisions about what types of labels we are actually going to need to describe the workloads that we are going to deploy in this orchestration system. So again, tip number one, let’s involve security in creating labels and creating the metadata that’s associated with our workloads.

Cody: Tip number two is to find a balance between trying to get it right the first time and trying to be agile with something like metadata. The thing that we see most often is that you won’t get it right the first time, but every time you have to change metadata, it’s going to be difficult. The metadata that we’re talking about, these are the labels for our Kubernetes pods. These are the other metadata, like what namespaces I’m putting the workloads in, and the service accounts that I’m associating them to. All of these things have a very large impact and a large surface area when you think about their dependencies on other things, or about how they are depended on by other things. So, for example, in both Project Calico and Tigera Secure, we use this metadata to actually bind security policies to our workloads.

Cody: And so if we’re constantly changing that metadata, not only is it impacting the metadata itself, it’s impacting the way that security policies bind to those workloads, and we may also be using that metadata for other things such as reporting, auditing, logs, etc., and even for triggering various behaviors within things like our CI/CD pipelines and other operators that we may have installed in our orchestrated system. So, it’s important that we try to get all of the players involved, again, the security team, the dev team, the ops team, the QA team, all of the players that are going to need to help in making those decisions, and try to get it as close as we can to the model that we want the first time. So again, tip number two, try to get it close the first time because every time you iterate it’s going to be difficult.

Cody: The other thing we see often is, we fight about whose role it is to define the taxonomy within a company. Who owns the actual metadata, the hierarchy of the terms that we’re going to use to describe our workload? The key thing that we want to look at from a model perspective is, there are really two different models that this can be done with. On the left side, I’ve got what’s called folksonomy. And typically this is a crowdsourced taxonomy. Anytime somebody wants to add a new label, they just add the label, there are no restrictions, and we use that label however we see fit. The problem with the folksonomy is, it is too open ended. It’s the extreme where there is no commonality, and nobody is actually managing the overall state of things.

Cody: Even with things like Wikipedia, where anybody has the capability to submit knowledge articles to create this crowdsourced encyclopedia, there are still moderators that need to go through and, at least, maintain some semblance of organization and commonality in the way that things are arranged. So, to one extreme you have the folksonomy. To the other extreme, we have the very rigid, top down taxonomy. So, this is where we have one group that may be responsible for creating all of the metadata identifiers that are going to be used within a system. The problem with going to the other extreme is that all of the agility that we’ve tried to gain from using a container orchestration system, and from trying to delegate control of parts of our policy to our developers, now hinges on the creation of labels. All we’ve done is shift the bottleneck, because at the beginning, especially, we won’t have all of the labels that are needed to adequately describe both the compliance needs of our system and the other authorization needs that will align those policies to various organizational policies.

Cody: So, what we really want to do is strike a balance between folksonomy and taxonomy. There’s another side effect, too, of having too rigid of a taxonomy. Sometimes we see that if I’ve only got a limited set of labels that I can apply to workloads, and the process for adding new labels or changing labels is extremely rigid, we’ll find that developers and users of our system will start abusing certain labels and will use them for things that they were never intended for.

Cody: So then that takes away the readability of our taxonomy. We should be able to look at labels and other metadata and just understand from reading the label itself what its intended purpose is, and maybe make some inference about the underlying workload, maybe the policies that are going to apply to that workload, et cetera. So when people start abusing labels, because there’s only a limited set of them, that’s the other extreme. So again, tip three, we want to strike a balance between folksonomy and taxonomy. You want to ensure that whatever process you establish for adding new metadata, number one, anybody can contribute to it. And number two, the entire turnaround is fast: when I submit a change request or an addition, that piece of metadata is either going to get denied quickly or added quickly, and I don’t introduce additional bureaucracy that slows down the overall agility that I’m trying to increase.

Cody: So, let’s get into some of the core drivers, and how we actually go about defining the taxonomy that we want to use. So, what’s really driving the move to container orchestration is this concept of microservices. We’ve taken our monolithic applications, and we’ve now decomposed them into smaller applications that are using the network as the mechanism to exchange data between our components. So now, instead of our components being in process, our components are using the network to exchange data, and at Tigera, our concern is we want to make sure that the exchanges between those components are secure and are what we intended.

Cody: And so, we have a commercial product and an open source project that both focus on being able to write network policies that will ensure that the interactions between those components are what we intended. And to bind the policies that we write, we use metadata that comes along with various orchestrators. So, for example, in Kubernetes, we have the concept of labels. We have the concept of namespaces. We also introduce the concept of a tier for policies, and all these pieces of metadata allow us to organize and select how our policies are going to bind to the various workloads. As we decompose these monoliths, that also means that we should probably be decomposing our policies. No longer should policies be focused on an application. I’m not writing a policy for an application. I’m really writing a policy for a certain behavior, or a certain type of interaction, that I’m going to see between components.
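To make the idea of binding policies to workloads through labels concrete, here is a minimal Python sketch of equality-based label selection, the same rule a Kubernetes `matchLabels` selector applies: a policy selects a workload when every key/value pair in its selector is present on the workload. The function name and the sample labels are assumptions for illustration and do not come from the talk.

```python
def selects(match_labels, workload_labels):
    """Return True when every selector pair is present on the workload.

    Mirrors matchLabels semantics: all pairs must match (logical AND),
    and an empty selector matches every workload.
    """
    return all(workload_labels.get(key) == value
               for key, value in match_labels.items())


# Hypothetical workload labels, for illustration only.
frontend = {"ldap": "client", "geo": "us"}

if __name__ == "__main__":
    print(selects({"ldap": "client"}, frontend))
    print(selects({"ldap": "client", "geo": "eu"}, frontend))
```

Because selection depends only on labels, the same policy keeps applying as pods are created, scaled, and replaced, with no per-application rule changes.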

Cody: The reason that’s important is, in a microservice world, any one of the components of my application can be scaled. And so, I want to write policies in such a way that when I bind them, I bind them to various conversations, because it’s a one-to-n graph: one microservice can talk to a lot of different microservices. And so when we write policies, we may have the same behavior between multiple microservices. So, I’m going to talk about that a little bit more in depth when we actually start looking at what makes a good micro policy, but for now just know that we need to decompose them.

Cody: So, for tip four, now that we know we need to decompose policies, what is a good micro policy? Number one, I think that good micro policies come in pairs, and those pairs are going to be signified by the labels that we attach. Let me give you a good example for that. Let’s assume that we have a service that provides LDAP … or we have a microservice or other server that provides LDAP services to our application. And we have multiple components that need to consume that information, that need to interact as a client with that LDAP server. A good way to structure my policy would be to have an LDAP server or service policy and an LDAP client policy. If I want to look at the way that I would structure my labels, I could have a single label in Kubernetes called LDAP, and the value of that label could then be service or client.

Cody: And then I would attach that label based on the role of the microservices that are going to use it. The other reason that’s important is, when I actually write my policy, I can write my policy in terms of that label. For example, if I’m writing my policy for the LDAP server and I talk about accepting inbound connections, I’m only going to accept inbound connections from other workloads that have the label LDAP Client. Conversely, when I’m writing the policy for my LDAP clients, I’m going to create a rule that says I can make egress connections to any workload that has the label LDAP Service. Now you see how those basically come in a producer and consumer pair.
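The producer and consumer pair described above could be sketched as two Kubernetes NetworkPolicy manifests, built here as Python dictionaries. This is a sketch under stated assumptions: the lowercase label key `ldap`, the policy names, the namespace argument, and the choice of TCP port 389 (the standard LDAP port) are illustrative and not taken from the talk. Note also that a bare `podSelector` in a rule only matches pods in the policy's own namespace.

```python
import json

LDAP_PORT = 389  # standard LDAP port; an assumption about how the service listens


def ldap_server_policy(namespace):
    """Ingress policy for LDAP servers: accept only workloads labeled ldap=client."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "ldap-service", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"ldap": "service"}},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{"podSelector": {"matchLabels": {"ldap": "client"}}}],
                "ports": [{"protocol": "TCP", "port": LDAP_PORT}],
            }],
        },
    }


def ldap_client_policy(namespace):
    """Egress policy for LDAP clients: allow connections to workloads labeled ldap=service."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "ldap-client", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"ldap": "client"}},
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{"podSelector": {"matchLabels": {"ldap": "service"}}}],
                "ports": [{"protocol": "TCP", "port": LDAP_PORT}],
            }],
        },
    }


if __name__ == "__main__":
    # Print both manifests as JSON, which kubectl also accepts.
    for policy in (ldap_server_policy("demo"), ldap_client_policy("demo")):
        print(json.dumps(policy, indent=2))
```

Each policy names only its own side of the conversation plus the label of its counterpart, so neither has to know which applications will eventually carry those labels.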

Cody: The nice thing about doing that is now I’ve separated my concerns. If my concerns are actually orthogonal and loosely coupled in this way, this allows me to compose various policies and give various behaviors to workloads, simply by attaching the right labels. So, that workload that was an LDAP client may also need to talk to the mail server, so it would be a mail client. It may also need to talk to DNS, so it would be a DNS client, et cetera. And so, now I can describe the capabilities and the behaviors of a workload simply by using labels. So, tip four, use micro policies because they’re much simpler to debug, because I’m only dealing with one concern, and I can compose them into much more complex scenarios.
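The composition idea can be modeled in a few lines of Python. This is only a toy evaluation of the client/service label convention, not how any policy engine is implemented; the workload names, the `SERVICES` tuple, and both function names are hypothetical.

```python
# Services that follow the client/service label convention (assumed for this sketch).
SERVICES = ("ldap", "mail", "dns")


def may_connect(src_labels, dst_labels, service):
    """True when src is labeled as a client of `service` and dst as its provider."""
    return (src_labels.get(service) == "client"
            and dst_labels.get(service) == "service")


def reachable(src_labels, workloads):
    """Sorted names of workloads that src may open connections to."""
    return sorted(
        name for name, labels in workloads.items()
        if any(may_connect(src_labels, labels, s) for s in SERVICES)
    )


# Hypothetical workloads; each behavior is granted by one label.
workloads = {
    "ldap-server": {"ldap": "service"},
    "mail-server": {"mail": "service", "ldap": "client"},
    "dns-server": {"dns": "service"},
    "billing-db": {"role": "database"},
}
frontend = {"ldap": "client", "mail": "client", "dns": "client"}

if __name__ == "__main__":
    print(reachable(frontend, workloads))
    print(reachable(workloads["mail-server"], workloads))
```

Removing one label from `frontend` removes exactly one capability, which is what makes a single-concern policy easy to debug.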

Cody: Here we show some additional examples of using those micro policies. So, in this example, suppose that I have a set of workloads that are running in the US and another set of workloads that are running in the EU. And for all intents and purposes, the workload’s application architecture is identical in both places. We have a product page that needs to talk to a details service, and maybe a review service. And maybe the review service then has another call down to a rating service, and that’s going to be the same for both the US and the EU. But I needed to shard my data, because the data that it was dealing with had data sovereignty rules attached to it and it had to be hosted in data centers local to either the US or the EU.

Cody: So, in this case, I have separated my concerns. I have a label for the geography, and I have a label for the behavior, which is called role. And what that allows us to do then is to say that, from an application perspective, I can define the policies, the technical policies, about how my application is constructed. And this is something that the application developers would be more in tune with. They’re going to know the ports and the protocols, and the way the actual application is structured. My compliance team, on the other hand, is particularly interested in ensuring that the EU data stays in the EU and the US data stays in the US. And so, I can structure my policies to bind to different roles, so I could have a policy that basically blocks any communication between workloads that aren’t labeled with the same geographical label, the geo label.
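A minimal Python sketch of how the two concerns combine: the compliance rule looks only at the `geo` label, the application rule looks only at the `role` label, and a connection is allowed when both agree. The role values, the `ALLOWED_CALLS` set, and the function names are assumptions chosen to mirror the product page example, not names from the talk.

```python
# Application-team concern: which role may call which role (assumed service graph).
ALLOWED_CALLS = {
    ("productpage", "details"),
    ("productpage", "reviews"),
    ("reviews", "ratings"),
}


def same_geo(src, dst):
    """Compliance-team rule: both workloads carry a geo label and the values match."""
    return src.get("geo") is not None and src.get("geo") == dst.get("geo")


def role_allows(src, dst):
    """Application-team rule: the caller's role may call the callee's role."""
    return (src.get("role"), dst.get("role")) in ALLOWED_CALLS


def allowed(src, dst):
    """A connection is permitted only when both independent rules permit it."""
    return same_geo(src, dst) and role_allows(src, dst)


us_page = {"geo": "us", "role": "productpage"}
us_reviews = {"geo": "us", "role": "reviews"}
eu_reviews = {"geo": "eu", "role": "reviews"}

if __name__ == "__main__":
    print(allowed(us_page, us_reviews))
    print(allowed(us_page, eu_reviews))
```

Either team can change its own rule without touching the other's, since each rule reads a different label.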

Cody: And then my developers would write the policy that basically enables the service graph to be constructed that actually completes the application. And so, on the left side, the way that we can look at that is, now I have aligned my labels to the organization and the responsibilities and the concerns of the organization. Since we know that labels attract policies, they also can carry along with them authorization. So, it’s important as you define labels to also ensure that you’ve got a mechanism, whether it be a CI/CD pipeline, an admission controller, etc., that is restricting which people within the organization are allowed to assign labels, because those labels will attract policy and enable workloads to talk to various other workloads. So, for tip five, align to the organization: we need to ensure that the authorizations given to my functional groups, things like the InfoSec group, a platform team, et cetera, let those groups, and only those groups, assign the sets of labels that will attract the policies they govern.
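The kind of check that a CI/CD step or an admission controller might run can be sketched in Python as a lookup from label key to the groups allowed to assign it. This is a hypothetical illustration, not the API of any real admission controller; the `LABEL_OWNERS` table, the group names, and the function name are all assumptions.

```python
# Which groups may assign each label key (hypothetical ownership table).
LABEL_OWNERS = {
    "geo": {"infosec"},
    "pci": {"infosec"},
    "ldap": {"platform", "app-dev"},
    "role": {"app-dev"},
}


def unauthorized_labels(requested_labels, user_groups):
    """Return the sorted label keys that the requester is not allowed to assign.

    Keys missing from LABEL_OWNERS have no owner, so they are always rejected;
    this keeps labels outside the agreed taxonomy from slipping in.
    """
    groups = set(user_groups)
    return sorted(
        key for key in requested_labels
        if not (LABEL_OWNERS.get(key, set()) & groups)
    )


if __name__ == "__main__":
    request = {"role": "reviews", "geo": "eu"}
    print(unauthorized_labels(request, ["app-dev"]))
    print(unauthorized_labels(request, ["app-dev", "infosec"]))
```

A deployment would be rejected whenever the returned list is non-empty, so a developer cannot grant a workload EU data access just by labeling it.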

Cody: Whereas we may delegate things like the application architecture and the policy that governs that to a different group. So what we’re really doing here is we’re separating, to some extent, the compliance and the application architecture, and then at run time, the aggregate set of labels will all be combined to create an overall policy for a workload. One of the ways that we can look at this is, if I were thinking about relational versus technical, the compliance teams may be concerned about how do I quarantine workloads, or what are the cross-cutting concerns in terms of PCI compliance, or other security concerns that are more global in nature or relational in nature between workloads. The data sovereignty example that we just used, and maybe even tenancy or data classification, can be part of that as well.

Cody: But as I move right, now I start getting into the actual technical aspects of my application architecture and how the microservices are actually woven together. And so with our product, one of the solutions that we offer is the capability to delegate those roles appropriately, and to say that our development teams, our lines of business, et cetera, that understand the technical nature of an application can be responsible for writing the pieces of policy that govern that, whereas our compliance teams can ensure that the right labels and policies are attached that govern how an application limits its communication so that it does maintain the right compliance. Things such as PCI compliant workloads probably shouldn’t be talking to non-PCI compliant workloads, and maybe EU workloads shouldn’t be talking to US workloads.

Cody: So, that brings us to the end of our presentation. So just to recap, we’ve discussed why labels are important. We’ve discussed that we probably shouldn’t change labels often, and that we need to include security at the beginning, so that they can help us come up with a set of labels that describe not only security but operational, dev, and QA concerns, all at the same time. We’ve discussed whether we should crowdsource those labels or whether we should be more rigid, and that whichever approach we take, we need to strike a balance and ensure that the process for managing labels is agile. We’ve talked about what a micro policy is and how we construct labels that allow that to be created: things like labels coming in pairs, the producer consumer model. And finally, we’ve talked about how we align those labels to the organization, ensuring that we have the right authorizations for assigning labels, and making sure that the right teams are writing the policies and creating the labels for the right part of the application, whether it be more on the relational and compliance side or more on the technical side. So at this time I’m going to bring the presentation to a close and open the floor for any questions.