5 Ways to Detect Malicious Activity & Protect Your Kubernetes Workloads

Organizations are rapidly moving more and more mission-critical applications to Kubernetes and the cloud to reduce costs, achieve faster deployment times, and improve operational efficiencies. But security teams struggle to achieve a strong security posture with Kubernetes and cloud-based resources because of the inability to apply conventional security practices in the cloud environment.

Join Threat Intelligence Research Engineer, Manoj Ahuje, for this webinar where he will cover five different attack scenarios on cloud-based Kubernetes infrastructure, and how to mitigate these malicious activities at each stage with Calico Enterprise and GlobalAlerts, a new feature just released.

Complete Transcript

Hi, good morning everyone. Today we are going to look at a cloud infrastructure attack where an attacker compromises multiple workloads in your cloud infrastructure and tries to move laterally. We will detect that lateral movement with global alerts. At the same time, we will look at malicious activities and anomalies and how to detect them with global alerts.

So let’s look at today’s attack scenario and how our infrastructure is set up. We have our enterprise site running in Kubernetes, exposed to the internet on port 80. It is a Drupal application running in the default namespace. On the right we have the crown namespace, where our critical business workload is running. It is Webmin, and Drupal pulls data from Webmin to show it to our front-end users. This is the setup we have right now, and we will be configuring our global alerts around these two workloads. So let’s look at how.

First of all, we want to talk about the current approach to policy. Most applications in a cloud environment end up being categorized, and policy is usually applied to that particular category or label. But imagine a pod is created and that label is missing on it. What happens? It may have unlimited access into your cloud. That is how policy gaps are created over time. Now imagine your business-critical web application. When that sort of application starts listening on a port and someone connects to it, in a case like that, this cannot be monitored. Also, when the business-critical application starts communicating with someone it is not intended to, through a policy gap, I want to know that. These are the cases we are going to target in this presentation for lateral movement.
So what do we want to know from our setup? We want to know when our Drupal application, the front-end enterprise application, talks to any resource inside the cloud apart from Webmin on port 10000. For sure I want to know that. We want to know if my critical business application, Webmin, starts talking or initiates a connection. And what happens if Webmin starts listening on a port and somebody connects to it? Being a daemon, Webmin should only listen on port 10000. So I want to be able to know that.

The second thing that is important in a cloud environment is the metadata and cluster API. If it is being accessed from your user space, that is really suspicious. Unless your developer has specifically written some code to access that API, those APIs won’t be accessed through pods or through your workloads. So if somebody is trying to access them, I want to know that.

We will also be considering suspicious IP and domain name threats through our threat [inaudible 00:08:28] feature, where you can import public feeds or have your own internal feeds. Those can be useful. The next thing we will target is DoS or DDoS attacks. We want to be able to know about those, and this is a really easy configuration with global alerts.

Policy monitoring is something really critical. What happens in large enterprises is that you have many admins and many roles, and someone who is not supposed to make changes gets an administrative account with privileges and starts making changes. So I want to be able to know if somebody in your cloud environment starts making changes to global network policies or network sets. This is what we will be looking at with global alerts.

Last but not least, DGA traffic. This is a top IOC. If your cluster sees DGA traffic, it is very likely compromised.
And this is something we will be talking about in the future. In this demo we won’t be showing that, but we will have a blog post on it. With that, we will start with our first demo, on lateral movement. Just give me a second here.

So here, this is the attacking machine. It is simply trying to see what the public enterprise site looks like. Being the attacker, I don’t care about the GUI. I’ll just go ahead and make a request to the application. Here I can see the headers and an HTTP 200 OK response, and in the headers I can see that the deployment running there is Drupal 8. Drupal 8 is interesting. If we do a little research on Drupal 8, we can see that it has a Form API injection. It works like this: Drupal has an API where, if we inject any input into a form, it accepts and stores it, but when there is a callback and that malicious input is an array, the values inside the array end up executing. That is the vulnerability we are targeting. With a little bit of research I discovered there is a Metasploit module available as well, so we will use that module for the attack.

So let’s look at the attack. The attacker is just firing up the console, and I’m going to set the Metasploit parameters for the Drupal site we are targeting. And we will try to exploit it. Okay. We see a session running on the Drupal side. We will go ahead and interact with this session. I will drop to a shell and see that I am the user with UID 33. This isn’t what I’m looking for, but I’m happy with any kind of access. I would like root, but it’s fine. I’m running as the application process user, which is www-data. I think the most interesting thing to look at is the service token here. We will look at what we can do with the service token later. But here is the service token. So the attack on Drupal is successful.
Now we will see how we did on the detection part. Thinking about these manifests [inaudible 00:14:35], they run once every 10 minutes. They look at a 10-minute window and they run once every 10 minutes. I’ll show that in the configuration briefly, but here we will just execute it right away, because we don’t want to wait 10 minutes. Let’s fire it up, and the manifest is successful.

So let’s look at the alert. We see a “lateral movement detected” alert, which reports the outgoing connection from Drupal. As soon as Drupal initiated a connection to an unknown host, the global alert came in and said, “Hey, your Drupal application has initiated a connection which is not approved or authorized.” I think we had some trouble with our security. All right, looks fine. As you can see, the global alert told us, and we can go and investigate this further. So that’s the Drupal attack. Let’s go back to our slides.

This is the manifest we use to configure alerts. The interesting things to look at are the period and the lookback. Both are set to 10 minutes. As I said, these manifests run every 10 minutes and they look at the data from T minus 10 minutes. The query is another interesting part to look at. You can be really creative here. This is what it looks like for outgoing connections from Drupal. In the same way we have queries configured for Webmin: Webmin is a server, so we are not expecting it to initiate any connections, but if it does, we will see an alert for that. And likewise, since Webmin is my business-critical application, if it starts listening on any other ports, I am going to know through this particular manifest. So these are my manifests, and this is the alert we saw for Drupal. Now we want to move to the next phase of the attack.
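The period, lookback, and query logic described above can be illustrated with a small Python sketch. This is not the GlobalAlert manifest or query language shown on the slides; the `Flow` record, the workload names "drupal" and "webmin", and the function names are all hypothetical, chosen only to show how a 10-minute lookback window combines with the two lateral movement rules from the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Flow:
    """Hypothetical flow-log record; field names are assumptions."""
    time: datetime
    source: str      # workload that initiated the connection
    dest: str        # workload or address that received it
    dest_port: int


def is_unexpected(flow: Flow) -> bool:
    """Apply the two rules from the talk to a single flow."""
    if flow.source == "webmin":
        # Webmin is a server and is not expected to initiate anything.
        return True
    if flow.source == "drupal":
        # Drupal may only talk to Webmin on port 10000.
        return not (flow.dest == "webmin" and flow.dest_port == 10000)
    return False


def lateral_movement_alerts(flows, now, lookback=timedelta(minutes=10)):
    """Return unexpected flows seen between now - lookback and now."""
    window_start = now - lookback
    return [f for f in flows if window_start <= f.time <= now and is_unexpected(f)]
```

In this sketch the period would be whatever schedule calls `lateral_movement_alerts`; with period and lookback both at 10 minutes, each run covers the window since the previous run.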
So let’s look at my screen. Okay. Here we already have a session running from the Drupal compromise, and we want to know what is accessible inside the cluster. There might be a lot of resources running in any cluster. Basically, the attacker would ping sweep the network or do port scanning. Here I’m going to use the ping-sweep method and sweep the /16 network. It takes a lot of time to sweep it, but the attacker is in no hurry. He can wait in there for months. So we sweep the network, and here we found a few hosts. After probing each one of them, the attacker finds out that there is a Webmin service available on port 10000, and now he wants to probe it further and see what’s going on there.

So we go to our Drupal session again and drop to a shell so that we can query the Webmin port, which is accessible from Drupal. Notice that Webmin is not accessible from the internet. This is purely your business-critical application. We make a request to Webmin and we see a lot of data. This is clear to me: this is SSL, and I see a 200 OK response. That’s good. I see the server header as well: MiniServ 1.910. A little bit of research on 1.910 tells us that there is a problem with this version. This version has a backdoor in its code, and this is the second time it has happened to MiniServ that an attacker was able to put a backdoor directly into the code. The backdoor is simple. When you go and change your password, you enter your new password and your old password, and whatever you put in the old password field gets executed. As simple as that. We have a module for that, and we will use it to exploit. So let’s look at it. Okay, here we go.
Now the attacker opens another shell to Drupal. We need two shells here: one to listen for the reverse shell if we successfully compromise Webmin, and another to send our exploit to Webmin. So we have two shells open. We will go ahead and drop into one of them. I already put some binaries there, which we can upload to Drupal since we have access to it. I use one of the binaries to listen on a particular port, 2222. So that is what I did: I went in and started listening on port 2222.

The second side is a little more interesting. It gets tricky in the cloud, because the destination is not reachable by the attacker. So I’m going to create a relay. Any traffic the relay gets from my attacker machine, it forwards to Webmin. This is as simple as one command. On my machine, anything I send to port 5000 goes to Webmin on port 10000, and this is done through Drupal. We set that up, go back to our console, and exploit Webmin by setting the proper parameters. So this is the console, we set the parameters, and we try the exploit. Let’s give it a try. And yay. As you can see on the right side, we’ve got a console running, and it’s UID zero. So we got root access on Webmin. The other thing we can always look at is the default token, so we will look at that.

So we have now compromised the Webmin service and we have root access to the pod. The attacker can use this as a mining pod, because he has the privileges, or he can use it as a … a key pod from which to launch attacks into your cluster. It depends on the attacker. But right now we’ll see how we did on detecting this attack. We will quickly execute our manifest, because we don’t want to wait 10 minutes. And as soon as we execute, let’s see. And yes, we got an alert.
We will go to our Calico Enterprise alerts UI and have a look. We see the top one: outgoing connection from Webmin. This is the connection we were looking for, where the Webmin workload initiated a connection while we were monitoring it, and now we are seeing it. We can investigate further to see what’s going on there. So that’s Webmin. Let’s move on to our slides. As you can see, this is the manifest we used, and this is the alert we already saw.

Now we come to cloud API access. This gets interesting, as I already mentioned. Since the data center is completely API driven, whenever you see probing of a cloud API, you want to know about it. We will look at a quick demo here. I already have our Drupal shells, so we’ll go into one and use our binary. We want the kubectl binary here, so we download it and see what we can do with it. The interesting thing about kubectl is that it automatically detects the default service tokens and everything else it needs to run, so we don’t need any extra configuration. With that binary, we can run kubectl to find out what privileges our token has. It is a simple command, and it tells us what privileges this token has. From the look of it, we don’t have any privileges anywhere, no update or patch privileges. Only the two self-subject entries, and that is nothing. So basically the default token doesn’t have any interesting privileges.

Though the token doesn’t have interesting privileges, the attacker did try to access the cloud API. So we execute our manifest, which monitors the cloud API. We see it run successfully, go to our Calico Enterprise alerts page, and see what the alert looks like.
So this is the alert we got: somebody in your user space, who has no business querying the API, is doing so. And these are all the details you get. This is the metadata API access. Whenever something tries to query it, you can monitor your metadata APIs, like [inaudible]. So we can monitor any cloud API with this. Back to our slides: this is the manifest that was triggered. The interesting fields to look at, again, are the period, the lookback, and the query itself. Sometimes it looks complex, but it’s not. You can be really creative with these. You can look at the blog post we did on this to get a better idea of the configuration. And this is the alert we saw.

And now suspicious IPs and domains. This is a really quick demo. So let’s look at our suspicious IP and domain feeds. Here we go again, back to the session on Drupal. We’re going to interact with it. What I’m going to do, while I’m in the shell, is pick a bunch of IPs and domains from the suspicious IP and domain feeds and talk to them. Here it is. I just started talking to this bunch of suspicious domains and IPs. You can see that I didn’t use any of these in my attack. [inaudible 00:30:39] probably, or one of his [inaudible 00:00:44], ends up talking to these. You will get an alert if you have the suspicious IP and domain feeds configured.

We will look at the alerts here. You don’t need to run any job. As soon as your workload tries to talk to any suspicious IP or domain, you will see the alert by default. Let’s see what it looks like. This shows that the workload has tried to talk to a domain. This is how it looks for a domain, and we see it is wcrowd.com. If a workload tries to talk to an IP… this is the alert for the IP. All right. Back to our slides.
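The feed matching behind the suspicious IP and domain alerts above can be sketched in Python. This is only an illustration of the idea, not how Calico Enterprise implements its feeds: the feed format (one IP, CIDR, or domain per line) and the function names `parse_feed` and `is_suspicious` are assumptions, and the sample entries use reserved documentation addresses and a made-up domain.

```python
import ipaddress


def parse_feed(entries):
    """Split feed lines into IP networks and domain names.

    Assumes one entry per line: an IP address, a CIDR block, or a domain.
    Blank lines and lines starting with '#' are ignored.
    """
    networks, domains = [], set()
    for entry in entries:
        entry = entry.strip().lower()
        if not entry or entry.startswith("#"):
            continue
        try:
            # A bare address becomes a single-host network (/32 or /128).
            networks.append(ipaddress.ip_network(entry, strict=False))
        except ValueError:
            # Not an IP or CIDR, so treat it as a domain name.
            domains.add(entry.rstrip("."))
    return networks, domains


def is_suspicious(destination, networks, domains):
    """Return True if a destination IP or domain appears in the feed."""
    dest = destination.strip().lower().rstrip(".")
    try:
        addr = ipaddress.ip_address(dest)
    except ValueError:
        return dest in domains
    return any(addr in net for net in networks)


# Hypothetical feed contents for illustration only.
FEED = ["# sample feed", "203.0.113.7", "198.51.100.0/24", "bad.example"]
NETWORKS, DOMAINS = parse_feed(FEED)
```

A real deployment would check every outbound flow and DNS lookup as it is logged, which matches the point in the demo that no scheduled job is needed for these alerts.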
As you see, “I didn’t use…” We already talked about this. This is the manifest that we use. Suspicious domains and IPs are really easy. We have full methods; you just need to specify which feeds you want to configure here, and we will configure them for you. The same goes for IPs. This is the alert we saw.

DoS and DDoS. If you are hosting a really sensitive service which can be DoSed, or you want to monitor for DoS and DDoS, it’s really easy. We can detect TCP-, UDP-, and DNS-based DoS. Let’s look quickly at how we detect it. Let me go back to my screen. Okay, here we go. This time, I’m the attacker. I’m just going to grab my resolvers and make 1,500 queries in a very short time. As soon as I’m done, I go to the alerts page. This is interesting. We configured our manifest to run every 10 minutes, so it aggregates all the DNS requests in that 10-minute period. If you want it shorter, you can make it five minutes. It depends. The threshold depends on your cluster. If you have 100 workloads the number will be different; with 1,000 workloads it will be different; with only 10 workloads it will be different again. That’s why I kept it open-ended and took a really small number. The number for me was 1,500. We see the alert, so if somebody is trying to DoS you, you’ll know immediately and can take action straight away to block it. Back to our slides. This is the configuration. As we know, we can configure it to be TCP, UDP, or DNS based, however you want.

One of the interesting things is policy monitoring, and it is critical. Your policy is everything in an API-driven data center. If you are really paranoid like me, you would want to know about each and every pod that is being created, and who is creating it.
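Going back to the DNS flood threshold from the DoS demo, the counting logic can be sketched as follows. This is a minimal illustration, not the manifest from the slides: the record shape (timestamp, client) and the function name are hypothetical, and treating "at or above 1,500" as the alert condition is an assumption, since the talk only gives the number.

```python
from datetime import datetime, timedelta


def dns_flood_check(queries, now, lookback=timedelta(minutes=10), threshold=1500):
    """Count DNS queries inside the lookback window and compare to a threshold.

    queries: iterable of (timestamp, client) tuples (hypothetical log shape).
    Returns (count, alert), where alert is True when count >= threshold.
    """
    window_start = now - lookback
    count = sum(1 for ts, _client in queries if window_start <= ts <= now)
    return count, count >= threshold
```

As the talk notes, the right threshold depends on the cluster: a value that is suspicious for 10 workloads may be normal background traffic for 1,000, so both `lookback` and `threshold` are left as parameters.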
If not, then I would trust my admins, and I would say, “These are the admins who are authorized to make changes.” If anybody apart from them in a large enterprise gets privileges and starts patching, deleting, or updating pods, I want to be able to know that. This is the first one. We will create a global alert configuration around this. Second, I want to know about my global network policies: if anybody apart from my authorized service accounts or admins is trying to create, update, delete, or patch them, trying to create a hole, I want to be able to know that. The same goes for global network sets. These are things we can easily configure with global alerts to monitor your policy changes. The manifests to do that are really simple. This is policy monitoring. I don’t have a demo for it, but it is really, really straightforward.

Thank you very much. That’s all we had. With the features and the monitoring spectrum we have, and with the context inside and outside the cluster, Calico Enterprise is in a unique position to hunt threats. This is a threat hunter’s paradise, I would say. Yeah. You might like to test it out. Thank you for coming, and goodbye.
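Since the policy monitoring section had no demo, here is a small Python sketch of the rule it describes: flag create, update, delete, or patch operations on global network policies and global network sets by anyone outside an authorized list. The event shape is loosely modelled on Kubernetes audit events (`verb`, `user.username`, `objectRef.resource`), but the function name, the resource strings, and the sample user names are assumptions for illustration, not the manifest from the slides.

```python
WATCHED_RESOURCES = {"globalnetworkpolicies", "globalnetworksets"}
WATCHED_VERBS = {"create", "update", "delete", "patch"}


def unauthorized_policy_changes(audit_events, authorized_users):
    """Return audit events that change watched resources by unapproved users.

    audit_events: iterable of dicts with 'verb', 'user' and 'objectRef' keys.
    authorized_users: set of user or service account names allowed to change policy.
    """
    findings = []
    for event in audit_events:
        verb = event.get("verb")
        user = event.get("user", {}).get("username")
        resource = event.get("objectRef", {}).get("resource")
        if (
            verb in WATCHED_VERBS
            and resource in WATCHED_RESOURCES
            and user not in authorized_users
        ):
            findings.append(event)
    return findings
```

Read-only verbs such as get and list are deliberately ignored here, because the talk is concerned with changes that could open a hole in policy.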