Why Can’t We Be Friends? – Kubernetes in a Zone-Based Architecture World

Since practically the beginning of data networks, Network and Security professionals have gravitated towards, and grown to love, Zone-Based network architectures.

However, with the evolving landscape driven by microservices, containers, and Kubernetes, Zone-Based designs are being challenged to keep networks secure without creating an unreasonable amount of continuous configuration changes to firewalls.

With this challenge comes the opportunity to rethink how network security can be delivered more effectively and efficiently. The Cloud and Kubernetes offer a ton of flexibility, but how do we achieve security, visibility, and compliance in these new areas?

This technical webinar will dive into how Tigera can help us answer these challenges and more in the cloud landscape.

Complete Transcript

Michael: Hello everybody and welcome to today’s webinar, Why Can’t We Be Friends? Kubernetes in a Zone-Based Architecture World.

Michael: So before I hand over the webinar to our presenter, I have a few housekeeping items to go through with you about the presentation, questions, and the webinar platform in general. So first off, today’s webinar will be available on demand pretty much immediately after the live session is over. It typically takes about 15 to 20 minutes. So if you want to go back and look at it again, you can.

Michael: Also, we’d really love to hear from you today. Questions are very important to us. It makes a much better presentation if we have a dialogue back and forth. We’re going to have two Q&A sessions. Our presenter Eddie is going to go through some of the presentation, and right before a demo we’ll stop and take questions. So if you have questions along the way, just type them in as you think of them. And then when we get to the Q&A session, we’ll address them.

Michael: He’s then going to do a demo and then we will do some more slides. Then we’ll have a Q&A at the end where we can take any questions we missed the first time and answer any new questions.

Michael: So that being said, let’s go ahead and talk about our presenter. I would like to introduce to you your speaker, Eddie Esquivel. Eddie is a senior solutions engineer here at Tigera. He has worked in technology for the last 20 years, first actually as a developer. Was that in the banking industry, Eddie? Did I remember that correctly?

Eddie Esquivel: Yeah, that’s correct.

Michael: And over the last few years he’s been focused on the Kubernetes space, with experience at CoreOS and now here with us at Tigera. In his spare time he enjoys exploring New York City with his wife.

Michael: So without further ado, I will hand the presentation over to Eddie. You’re in his capable hands and we’ll have a great presentation. So enjoy.

Eddie Esquivel: Great, thanks Michael for that intro.
So yeah, let’s go ahead and get started with the presentation here.

Eddie Esquivel: The first thing I wanted to discuss was this notion of a zone-based architecture. Essentially, a zone-based architecture as we know it in the data center is going to have a few zones. There’s going to be a zone that describes the front end apps and maybe has access to the internet. There may be a middle tier zone where you just get kind of the business logic. Then there’s the treasure that you keep, so to speak, in the restricted zone. These are generally the best practices for a data center today. And it gives you some definite benefits.

Eddie Esquivel: For example, it allows you to establish boundaries between the applications, and it allows you to detect any sort of traffic that goes between the firewalls. You’re going to be able to have visibility into that. And last but not least, you’re going to be able to limit the blast radius if there is a breach of any sort. So these are fairly well understood network best practices in the data center, as highlighted here by this graphic.

So when we talk about zone visibility, the zone visibility really comes by virtue of the firewall. So you’re able to see the [crosstalk 00:03:38] zone visibility. We’ll see people go ahead and try to hairpin traffic, all traffic, back to the firewall.

With the migration to the cloud, we’re starting to see people perhaps try to rethink different aspects of the application and how they’ve architected it, maybe going from a monolithic application to microservices. But one of the interesting things is that by and large the network paradigm or architecture that we’ve been used to on prem hasn’t changed all that much. So for example, if I take a look at some screenshots here from various cloud provider documentation, if we take a look at this AWS screenshot, it’s similar to a zone-based architecture.
You’ve got VPCs for the front end, you’ve got blue-green deployment, and then VPCs that kind of firewall off any sort of traffic. Likewise for Azure, there’s a very similar network architecture diagram here. And even Google is getting into the game with zone-based architecture. So this is what people are typically trying to do in the cloud. And again, as was mentioned earlier, there are some key benefits, not the least of which is containment, and some type of visibility in terms of interactions between the network zones.

So guess what? We’re obviously going to discuss Kubernetes today. Kubernetes kind of puts a huge gap in that architecture, not the least of which is the fact that we’re dealing with dynamic IPs. So trying to create firewall rules where you have, at a basic level, one static IP communicating with another static IP, that paradigm isn’t really going to fly in a Kubernetes world. Again, dynamic IPs just do not lend themselves to what we’ve traditionally been doing on prem, and even to some degree in the cloud.

There’s little visibility within the zones of a Kubernetes cluster. If you really wanted to leverage firewall zones, you’re going to have to punch huge, wide IP ranges in the firewall between these different network zones. You’re still going to be limited by the visibility that you may or may not get within the zone. So firewalls by and large don’t really apply to Kubernetes.

Another key issue that we’ll see is, for example, if you need to have access to an external service, maybe AWS RDS, maybe it’s a Snowflake database or something, it really becomes an all or nothing proposition. Either all the pods in your cluster have access to it or none of them do. So firewalls are a bit tough, a bit challenging within the Kubernetes ecosystem. What we once had in the traditional environment in terms of visibility really gets limited. So enforcing policies between network zones gets a bit hairy when you’re dealing with these dynamic IPs.
You may get some logging and event correlation, but it’s certainly not going to be enough to satisfy even the most lax of CSOs out there.

So is there a better approach? I think at this point in the conversation, I love highlighting an epiphany I had when I read this book called The Grasslands back in college many years ago. The book was basically making a point that hey, we’re decimating our ecosystem in the Midwest, the grassland ecosystem. And that is due in large part to all the cattle that we’ve placed on the grassland. The way a cow eats the grass, basically it pulls it out from the roots and it doesn’t allow the grass to regrow. And the book made this interesting observation. We killed 100 million bison to put 100 million cows on the grass. Why don’t we just rethink the problem and wonder if maybe the bison were tasty and we could eat bison instead of having to replace them? It’s an interesting epiphany.

So a similar epiphany, I think, is merited for the conversation we’re having here about Kubernetes. Okay, so we’re moving to Kubernetes. Do we really have to stick with a traditional firewall? Is there a better way? And the answer is obviously that there is: a combination of Calico and Istio, two projects that are very near and dear to the hearts of Tigera. I mean, we brought to bear Calico, the open source project, and we’re heavily involved in the Istio project. With these two open source technologies, we can achieve that full inter-zone visibility. We can achieve layer three through layer seven security, and we can obviously limit the blast radius should there be an intrusion within the cluster.

The question we’ll often get is, is it one or the other? Do I have to do Istio only? Do I have to do Calico only? Can I do one or the other? The answer is that both together are what really give us a strong sense of security. Layer three and four, BGP, IP routing, we can allow Calico to handle that.
But then for the HTTP applications, the layer seven protocols, that’s where Istio really comes into the picture. And by virtue of the demo that we’re going to get into in a second here, we’re going to highlight how the two technologies complement each other to give us that zero trust security.

Before we dive into that, I did want to just give a primer to folks on how we achieve these types of zone-based network policies within Calico. One of the key things we need to understand in Kubernetes is the notion of labels. Labels are key ingredients to how we can achieve this.

And again, let’s also discuss the motivation for network policy. If we have n services within the cluster, you have n-squared possible connections. As we know, only a fraction of those are actually necessary for the applications. So any of these paths that are unused by the application are merely useful to an attacker. So what we really want to start to do is pare down the connectivity within the cluster to something a lot sparser. So, for example, we know that the front end load balancers can still communicate directly with the database. That’s the line of communication that we would want to go ahead and restrict and eliminate from within a cluster. There’s a notion of namespace isolation, which governs whether one namespace can communicate with another namespace. And there’s also finer grained isolation that we can achieve with Calico, even within a namespace.

So some example use cases. Not a lot of folks, but some folks will actually deploy dev, test, and prod instances together in the same cluster. Typically we’ll see dev and test in one cluster and maybe production in a separate cluster. Folks may have compliance issues or requirements around PCI or HIPAA, and that’s really where you need some network policies to go ahead and prove to an auditor that you have these PCI zones in your Kubernetes cluster. So one of the key concepts in how we achieve this is labels.
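
[Editor’s illustration] To put rough numbers on the motivation above, the following is a minimal Python sketch that counts possible versus needed connections. The service names and the `needed` set are assumptions made up for illustration; they are not taken from the webinar demo. It counts directed pairs excluding self-connections, so n services give n * (n - 1) paths, which is on the order of the n-squared figure mentioned.

```python
from itertools import permutations

# Hypothetical services in one cluster (illustrative names only).
services = ["frontend", "middle-tier", "database", "billing", "reporting"]

# With no restrictions, every ordered (source, destination) pair is a
# possible connection: n * (n - 1) directed paths.
possible = set(permutations(services, 2))

# Assumed set of paths the application actually needs.
needed = {
    ("frontend", "middle-tier"),
    ("middle-tier", "database"),
    ("reporting", "database"),
}

# Paths that serve no application purpose but stay open to an attacker.
unneeded = possible - needed

print(f"possible={len(possible)} needed={len(needed)} unneeded={len(unneeded)}")
# The direct frontend-to-database path is one of the paths worth removing.
print(("frontend", "database") in unneeded)
```

Even in this small made-up case most of the open paths are ones the application never uses, which is the gap network policy is meant to close.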
For most folks that have been using Kubernetes for any appreciable amount of time, we all know and leverage labels extensively. But for those that are perhaps new to Kubernetes and need to understand how policies work in this environment, labels are essentially a key-value pair that is totally user defined and that you can apply to just about any aspect of the Kubernetes deployment, whether it’s an application or a namespace itself. And there are no limits to how many key-value pairs you can create. The nice thing about labels is that you can actually reference [inaudible 00:12:14] by their labels, and we’ll see how that plays an important role within network policy.

When we’re using labels, we can go ahead and use equality-based operators or we can do set membership. For example, we can create a query that says give me all the pods that equal this label and not this other label. And they really are a key part of the network policy. So when we go ahead and show you how to actually create the network policy here in a bit, we’re going to see this notion of pod selectors and also namespace label selectors.

Let’s go ahead and take a look at a sample network policy. First and foremost, you’re going to notice the API version and the kind, NetworkPolicy. It could be any Kubernetes construct. It could be a Deployment, a ReplicaSet, [inaudible 00:13:14] set. In this case we’re focusing on the network policy. We can see that it’s going to apply to, for example, a specific namespace. We can also create global network policies with Calico. When we talk about the pod selector where we match labels, notice in this case we’re going to apply this policy to any pod that has the label role: database. So these blue pods are going to be the ones that actually have this policy applied to them.
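
[Editor’s illustration] The selector behaviour described above (equality, inequality, and set membership on labels) can be sketched in a few lines of Python. This is a simplified model written for illustration, not the Kubernetes or Calico implementation; the `matches` helper, the pod names, and the `env` label are assumptions, while the `role: database` label comes from the sample policy discussed above.

```python
def matches(labels, equals=None, not_equals=None, in_set=None):
    """Return True if a label dict satisfies every selector term given."""
    equals = equals or {}
    not_equals = not_equals or {}
    in_set = in_set or {}
    # Equality-based: the key must be present with exactly this value.
    if any(labels.get(k) != v for k, v in equals.items()):
        return False
    # Inequality: the key must not carry this value (a missing key passes).
    if any(labels.get(k) == v for k, v in not_equals.items()):
        return False
    # Set membership: the key's value must be one of the listed values.
    if any(labels.get(k) not in allowed for k, allowed in in_set.items()):
        return False
    return True


# Hypothetical pods and their labels.
pods = {
    "db-0": {"role": "database", "env": "prod"},
    "db-test": {"role": "database", "env": "test"},
    "web-0": {"role": "frontend", "env": "prod"},
}

# "Any pod that has the label role: database".
databases = [n for n, l in pods.items() if matches(l, equals={"role": "database"})]

# "Pods that equal this label and not this other label".
non_test_dbs = [
    n for n, l in pods.items()
    if matches(l, equals={"role": "database"}, not_equals={"env": "test"})
]

# Set membership: env is one of a list of values.
prod_or_staging = [
    n for n, l in pods.items() if matches(l, in_set={"env": {"prod", "staging"}})
]

print(databases, non_test_dbs, prod_or_staging)
```

Because selection is by label rather than by address, a pod that is rescheduled with a new IP is still selected as long as its labels are unchanged.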
And what we’re basically going to do is allow ingress from any front end pods on specific ports. And by creating a policy like this, we have a default deny [posture 00:14:00] by virtue of the policy. So we’re going to eliminate any other traffic that can go ahead and hit this pod.

Eddie Esquivel: Before we dive into the demo, I’d love to pause for a Q&A. Please feel free to enter any sort of questions in the chat and we can go ahead and handle them. We’ll allow about 30 seconds. There’s a lag between you actually punching in the questions and myself seeing them. So we’ll give y’all a few seconds here to go ahead and ask questions, if you have any.

Okay, let’s do it. I’m going to go ahead and share my screen and switch from the deck to my screen. So by the end of the demo, we’re going to have a scenario where we have Calico, by virtue of Felix, locking down layer two and layer three. Then we’ll also have Istio locking down the layer seven application, by virtue of the [inaudible 00:16:09] integration that we’ve built here at Tigera.

What we’re going to start with is a very basic scenario. We have one single Kubernetes cluster. We have a couple of namespaces. We’re going to have several pods in each environment. And in the default Kubernetes deployment, this basically means that everything can talk to everything. So not ideal, but we’re going to see how we can go ahead and whittle this down to just the connectivity that we actually want. And for good measure, I’m going to throw an attacking pod in there and we’re going to leverage the favorite hacking tool, kubectl, to go ahead and actually execute some pings and see what we have access to within the cluster.

So one of the first things we’re going to go ahead and do is namespace isolation, or more than namespace, actually label-based isolation. What I mean is we’re going to try to replicate within each one of these namespaces the basic three-tiered architecture that we’ve leveraged on premise.
You may have a DMZ zone, the front end apps, the customer apps that are facing the internet, for example, whatever you have, and then the back end apps. Notice that the attacking pod actually was able to ingrain itself within this restricted zone. So we have a couple of namespaces and we’re going to create some policies, and we’ll actually see what these policies look like. So let’s go ahead and take a look at that.

If we head over to a console, I have my demo scenario here. The first thing we’re going to do is exec into the attacking pod and just see what we have access to. So from the attacking pod, that red pod in the demo slide, we’re going to notice that we have access to everything. Not ideal. So let’s go ahead and actually secure this environment a bit more by means of network policies, Calico policies to be specific.

So I created some policies. Notice the policies popped up here in Tigera Secure. And if I go ahead and run the same commands one more time, we’re going to notice that we’ve gone ahead and limited the scope of what that attacking pod actually has access to. So we’re going to see some failed pings here.

But let’s go ahead and take a look at what this policy actually looks like here in Tigera Secure. We’re going to notice, similarly to what we described in the slides, that we’re going to apply this to any pod that has a label firewall zone equals restricted. And we’re basically going to allow ingress only from a couple of endpoints that are labeled with very specific labels. If we wanted to, we could also create egress policies within Tigera Secure.

So let’s go ahead and take a look at what these labels actually look like. For example, if I take a look at the actual database YAML, we’re going to notice that the database that I have running in my cluster, if I go to the top of the deployment, actually has these labels.
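
[Editor’s illustration] The behaviour seen in this part of the demo (a policy selects pods by a zone label, allows ingress only from specifically labeled sources, and thereby denies everything else to those pods) can be modelled with a short Python sketch. This is a simplified model under stated assumptions, not Calico’s evaluation logic: the `IngressPolicy` class, the helper functions, the label key spelling `firewall-zone`, and every label value other than `restricted` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IngressPolicy:
    selector: dict    # labels a destination pod must carry for the policy to apply
    allow_from: list  # label dicts; a source matching any one of them is allowed


def has_labels(labels, required):
    """True if every required key/value pair is present in labels."""
    return all(labels.get(key) == value for key, value in required.items())


def ingress_allowed(src, dst, policies):
    """Decide whether src may connect to dst under the given policies."""
    applicable = [p for p in policies if has_labels(dst, p.selector)]
    if not applicable:
        # No policy selects the destination, so it stays wide open.
        return True
    # Once selected, only explicitly allowed sources get through;
    # everything else is denied by default.
    return any(has_labels(src, rule) for p in applicable for rule in p.allow_from)


# Assumed policy: protect the restricted zone, allow two kinds of sources.
restricted_policy = IngressPolicy(
    selector={"firewall-zone": "restricted"},
    allow_from=[{"firewall-zone": "trusted"}, {"app": "summary"}],
)

database = {"app": "database", "firewall-zone": "restricted"}
summary = {"app": "summary", "firewall-zone": "trusted"}
frontend = {"app": "frontend", "firewall-zone": "dmz"}
attacker = {"app": "attacker"}

policies = [restricted_policy]
print(ingress_allowed(summary, database, policies))   # allowed source
print(ingress_allowed(attacker, database, policies))  # denied by default
print(ingress_allowed(attacker, frontend, policies))  # no policy selects frontend
```

The last call shows why a policy is needed for every zone: a pod that no policy selects remains reachable from anywhere.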
So for example, this label is the key one that we’re leveraging within this policy. And by going ahead and setting up these policies, we’ve now achieved some better security. Basically we’ve gone from a wide open environment to this environment. So we still have access from the customer; the database can access customer pods if necessary because those are in the DMZ zone. We have access from summary because that’s the tier that actually needs access. And the database can also talk to the front end since the front end is internet facing.

So what if we wanted to go ahead and employ a specific policy to, for all intents and purposes, punch a hole in this firewall? This would be essentially what a firewall change would look like. How would we do that? Are we going to leverage static IPs? I mean, we know that that’s a nonstarter within Kubernetes. So what we’re going to do is leverage labels to go ahead and facilitate what would be a firewall zone. At this point, if I’m talking to a prospect or a customer and we talk about how long it takes to create a firewall zone, I usually get eye rolls or laughs or some combination of the two, because it is something that can take quite a bit of time in a lot of different shops out there.

So let’s go ahead and facilitate this network zone policy here. The first thing we’re going to go ahead and do is actually highlight that at the beginning we don’t have access to that database pod. So the bank info pod as described here does not have access to the database pod. So we’re going to go ahead and do our firewall change here, creating a policy. Let’s go ahead and take a look at that policy. We notice that we’re now going to, again, apply it to the database, and we’re going to allow any app that has the bank info label to have access to it. So I went ahead and created that policy. I’m going to run this again.
And we notice that immediately the bank info pod now has access to the database pod.

One of the key features of Tigera Secure is this notion of tiering, where basically we can group policies, we can apply RBAC to these groups of policies, and we can evaluate the policies in such a way that very important policies will get evaluated first and will never get overwritten by underlying policies. So for example, because this was the namespace [inaudible 00:21:57] policy, it may be a policy that your security team would want to enable. We allow the platform team its own tier here, and anything the platform team does in terms of policy would never override anything that the security team does. So this is basically policy tiering and how we are able to give different teams within your company different access to creating these policies. It’s a key feature that really separates Tigera Secure from even just open source Calico.

Let’s keep moving. We still have this notion of the attacking pod, so we still need to do something about it. We want to go ahead and first simulate what this attacker actually has access to. Then the end result is that we actually want to be able to lock down that attacking pod. Let’s go ahead and do that. So if you go ahead and take a look at what that attacking pod has access to, we’ll notice that the pod actually has curl HTTP access to that pod. So we are moments away from the usernames and passwords getting exfiltrated out onto the world wide web. It’s not an ideal scenario.

Let’s go ahead and lock this down. So we’re going to go ahead and create a policy, a network policy, to go ahead and lock this up. The first thing we’re going to do is toggle Istio. And toggling Istio basically means, we already have the Istio control plane running in a namespace within the cluster. What we’re basically going to do is add a label to the bank namespace.
What that does is it basically kicks the pods and injects a couple of containers into each pod. So first and foremost, anybody familiar with the Istio architecture knows that we inject the Envoy container. This is essentially the proxy that handles all the TLS and all of the cool service [inaudible 00:23:59] features that Istio allows for. But in our situation, we’re also going to inject a second container, a container we refer to as Dikastes, built here at Tigera. This container basically pushes policy driven by Calico into Envoy.

So for example, if I go ahead and take a look at the pods that we’ve now kicked, we’ll notice that these pods, whereas they once had one container, now have two containers. So Dikastes and Envoy have been injected into these pods. And if you take a look at the labels for the namespaces, we notice that really the key thing here is that bank now has these two injections enabled.

That still doesn’t mean that we’ve fully eliminated the attack vector. If I run this attack, we’ll notice that the attack still has access. The reason for that is this attacking pod was able to go ahead and pull the certs off the box, inject them into itself, and execute the curl. Anybody that’s worked with Kubernetes knows that’s fairly straightforward: it just pulled the certs and mounted them in as volumes. And if you go ahead and take a look at the actual curl command, we’ll see how that is actually the case. So if I actually run this command, we’ll notice that we’re actually curling the database and we’re actually pulling the certs and leveraging those to do the successful HTTP curl command.

What we’ve done with Tigera is this notion of multifactor authentication. So certs are a key factor in determining trust. But the other identity that we leverage within Tigera Secure is the notion of a Kubernetes identity.
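
[Editor’s illustration] The multifactor idea described above, where a valid certificate alone is not sufficient and the caller’s Kubernetes identity is checked as well, can be sketched as follows. This is a conceptual model only and not how Tigera Secure or Istio implement the check; the `Request` class, the `is_allowed` function, the pod names, and the service account names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    cert_valid: bool           # factor 1: the presented certificate verifies
    cert_service_account: str  # identity named in the presented certificate
    source_pod: str            # pod the connection actually originates from


# Hypothetical view of what the cluster knows: pod name -> service account.
POD_SERVICE_ACCOUNTS = {
    "summary-7d9f": "summary",
    "attacker-1": "default",
}

# Service accounts the policy allows to reach the protected service.
ALLOWED_SERVICE_ACCOUNTS = {"summary"}


def is_allowed(req):
    """Allow only if the cert is valid AND the Kubernetes identity checks out."""
    if not req.cert_valid:
        return False
    # Factor 2: the identity in the cert must be the one the cluster
    # has on record for the pod that is actually making the request.
    actual = POD_SERVICE_ACCOUNTS.get(req.source_pod)
    if actual != req.cert_service_account:
        return False
    return actual in ALLOWED_SERVICE_ACCOUNTS


# Legitimate caller: its own cert, its own service account.
legit = Request(cert_valid=True, cert_service_account="summary", source_pod="summary-7d9f")

# Attacker replaying certs lifted from another pod: the cert verifies,
# but the identity does not belong to the requesting pod.
stolen = Request(cert_valid=True, cert_service_account="summary", source_pod="attacker-1")

print(is_allowed(legit), is_allowed(stolen))
```

In this model the stolen certificate passes the first check and fails the second, which is the point of treating the Kubernetes identity as a separate factor.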
So let’s go ahead and create a policy to actually start leveraging that Kubernetes identity. I’m going to go ahead and create a policy here. If we actually take a look at the policy, we’ll notice that before we were leveraging labels. Now we’re going to start leveraging a service account name, and we’re going to get very specific and allow GETs to very specific HTTP paths. So what this basically does is, if we run this command again, the deploy attack, we’ll notice that we’re now getting a 403. And what happened is, okay, yeah, you may have lifted the certs off the pods, but is the attacking pod actually allowed to connect to us? That is decided by virtue of the policy that we created and by validating with Kubernetes that this service account actually belongs to the pod that’s trying to execute this request. So again, it’s a form of two-factor authentication, if you will, where the cert isn’t the only identity that’s going to get you through the front door.

So I think by virtue of this demo we were able to highlight how Calico locks down layer two, layer three policy and how Istio allows finer grained HTTP access in the cluster. And between the two, we’re able to get full security within our cluster. So with that in mind, I’d love to leave it open to questions if there are any. And also, I’ll hand it back to you, Michael.

Michael: Yeah, so while we’re waiting for the questions to come in, as you mentioned, there is about a 30 second lag when you enter the question before we see it. So if you guys have any questions, please enter them now. And while you’re doing that, I’ll inform you about the upcoming webinars.

Michael: So the webinar that was originally scheduled for today, about Kubernetes, Helm, and network security best practices at scale, will actually be happening later this month on the 24th, Wednesday. Our presenter Brandon injured his hand, and he was going to be doing so much demo work that he couldn’t type.
So he couldn’t do the webinar today. But he’s healing and will be ready for the 24th. So do join us for that. That webinar is up and ready and you can register for it now.

Michael: And also, if you have attended these webinars in the past, you know that we promote this webinar heavily because it’s just a great webinar with AWS, Tigera, and Atlassian. It’s an Atlassian case study about how Atlassian moved their applications to the cloud in a secure manner using Tigera and AWS. The people from Atlassian are great and it’s a great presentation to watch. So do take a look if you can.

Michael: So the question is, what prevents the attacker from running these commands themselves? In other words, what is the security footprint for Tigera itself?

Eddie Esquivel: Oh, okay, good question. So in terms of the identity, when you get into the certs and the service account identity, in terms of being able to actually run the commands, I’m leveraging the admin conf for Kubernetes. That’s generally not a best practice. You don’t hand out admin confs to all of your developers. You have that pretty locked down and you would implement some notion of RBAC within your cluster. And maybe tie that back to LDAP or something like that.

Eddie Esquivel: So that would also be another aspect, another strengthening factor for your cluster. A great question though.

Michael: Eddie, thank you for presenting. And everyone, thank you for attending. I hope you guys have a great rest of your Wednesday, and we will see you in two weeks, when we will be talking about Kubernetes, Helm, and best practices at scale. Thank you.

Eddie Esquivel: Great, thank you all.