In the previous post, I introduced the idea that a cloud-native or microservice application environment differs from the environments that preceded it in some fairly fundamental ways.
In this post, I’m going to explore one of these platforms to give you a bit of grounding in what these platforms do and, at a very high level, how they do it. There are a number of platforms in the container management space, such as Docker’s Swarm and Mesosphere’s DC/OS, that do many of the same things, with the same concepts, but with significant differences between them. Kubernetes, however, seems to have the greatest adoption, and therefore, that is the one I will use as an example. Kubernetes is an open source project that originated at Google and is based in part on Borg, the internal infrastructure Google has been using for over a decade.
A full discussion about Kubernetes is beyond the scope of this blog post, but you can read a great write-up about the concepts involved for more details. That said, here are some of the high points:
Kubernetes is based on “everything as code”
Modern applications are assembled from modular bits of code that are religiously curated from code repositories, put through an automated test and deployment CI/CD model, and assembled just-in-time with ephemeral run-time components and connections. This JIT assembly, along with agile development patterns, is enabling even legacy businesses to become software enabled, or even software driven.
While this approach has driven some radical changes in software development, it has also introduced an impedance mismatch between today’s software development process and the infrastructure that hosts the artifacts, or output, of that process.
The end goal of increased release speed required for a software-enabled business cannot be met if an application stack can be updated and scaled multiple times a day, but it takes days, weeks, or even months to make the corresponding changes to the supporting infrastructure. All you have achieved is to kick the can down the road; the end result is unchanged.
This is not a new problem: platforms such as Kubernetes address “everything as code” by defining the infrastructure, and its relationship with the software and applications it supports, via well-defined, modular interfaces that can be manipulated as code. If you want to:
- Define how a given microservice is to be launched and life-cycled – write a bit of code.
- Define a service to expose that microservice internally and/or externally – write a bit of code.
- Define the storage and networking required for that microservice – write a bit of code.
- Instantiate a load balancer for that service – write a bit of code.
- Change the version of the microservice that is deployed – write a bit of code.
These code fragments are simple declarations of how you intend the infrastructure to interact with your code and services. Most importantly, they are life-cycled just like the rest of your code, and probably alongside your code: they are checked into a source code repository like Git, deployed and validated through the same CI/CD pipeline, and so on.
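To make that concrete, here is a minimal sketch of what one of these “bits of code” looks like in Kubernetes: a Deployment manifest declaring how a microservice is launched and life-cycled. All names, the image registry, and the version tag are hypothetical.

```yaml
# Hypothetical Deployment manifest: declares how the 'web-front-end'
# microservice is launched, how many replicas should run, and which
# image version to use. This file lives in Git next to the application code.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-front-end
spec:
  replicas: 3                       # desired number of running instances
  selector:
    matchLabels:
      app: web-front-end
  template:
    metadata:
      labels:
        app: web-front-end          # label used later to select these Pods
    spec:
      containers:
      - name: web
        image: registry.example.com/web-front-end:1.0.0
        ports:
        - containerPort: 8080
```

A change to this file (say, bumping `replicas` or the image tag) goes through the same review and CI/CD pipeline as any other code change.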
Since the code defining the microservice or application, and the code defining the interaction between the code/microservice and the infrastructure are developed, managed, and life-cycled concurrently, the evolution of the infrastructure tracks the evolution of your code base.
Similarly, the same mechanisms you use to control the deployment of your code such as peer-reviews, automated testing, and others, can also be applied to your infrastructure. This can be vastly superior to manual intervention or e-mailed trouble ticket requests.
This concept is widely referred to as “everything as code” and underpins both the microservice revolution, and the infrastructure that supports that revolution. Thankfully – Kubernetes is based on “everything as code.”
Kubernetes is based on the concept that everything rendered is immutable
This is one of the key differences between the previous generations of platforms and the cloud-native model. In previous platforms, each instance of an application or server was a specific “pet.” VMs were managed no differently than physical servers. New versions of software would be loaded onto the VM and updated. If security vulnerabilities were discovered, those VMs would be patched, very often manually, one by one. Operational issues would be resolved by operations teams logging into the affected servers or VMs to make the necessary adjustments.
The problem with this model is that each server (physical or virtual) becomes its own entity. There is no commonality. This approach, by its very nature, limits scale. There is no rational way in this model, to deal with potentially tens or hundreds of thousands of containers, each slightly unique.
If any change needs to be made to either the infrastructure or the code that runs on that infrastructure, it is updated and versioned via the CI/CD mechanism discussed above. The resulting new versions of the infrastructure and/or code are deployed, and the older versions are drained and shut down. Therefore, Kubernetes drives an assumption of immutability.
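In Kubernetes terms, this replace-rather-than-patch behavior is what a Deployment’s rolling update strategy does. A sketch of the relevant fragment of a Deployment spec (values are illustrative) might look like:

```yaml
# Fragment of a Deployment spec: on each change, new Pods are created
# from the updated template and old Pods are drained and removed.
# Running Pods are never patched in place.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra Pod during the rollout
      maxUnavailable: 0    # never drop below the desired replica count
```

Each rollout produces a fresh set of containers from the new image; the old ones are discarded rather than modified.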
This means that there is no need for anyone to “log in” to infrastructure components or containers/microservices. There is no reason to change running platforms. Any changes are then easily traceable and replicable via the CI/CD infrastructure.
Kubernetes is based on a declarative model
In Kubernetes, as in most cloud-native systems, the definition of the services and the infrastructure is declarative in nature. Instead of configuring the system based on what is currently running (such as firewall or load balancing configurations tied to the specific IP addresses of current workloads), the instructions are declarations of what the system should do in a given situation. The declarations define the behavior of the system, and it is up to the system to evaluate the existing environment and compare it to the desired state, as defined by the declarative statements injected into Kubernetes, usually by the CI/CD environment.
This allows for the system to be configured a priori, and then, as applications or services are deployed or modified, those declarations will be adhered to, and the system will self-adjust and continue to adhere to those declarations.
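A good illustration of this a priori, self-adjusting behavior is a HorizontalPodAutoscaler: you declare the acceptable bounds once, and the system continuously reconciles toward them as load changes. The target name and thresholds below are hypothetical.

```yaml
# Hypothetical declaration: keep 'web-front-end' between 2 and 10
# replicas, targeting roughly 70% average CPU utilization. The system
# compares observed state to this desired state and adjusts on its own.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-front-end
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-front-end
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

Nobody ever tells the system to “add a fourth replica”; they only declare the conditions under which more or fewer replicas are appropriate.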
Kubernetes is based on maximal distribution
Earlier generations of orchestration systems were very centralized, with a single (possibly logical) controller making all the global decisions about application lifecycle actions. This worked at small to moderate scale, but it breaks down at large scale and at high rates of change.
Cloud-native systems, on the other hand, distribute those control activities to the lowest level possible. In most cases, the only centralized decisions made at the controller level are system-wide ones, such as cluster-wide service proxying and the selection of which nodes should host which workloads. The actual actions that instantiate workloads, monitor them, and life-cycle them are taken locally, on the node hosting the workload in question. This allows the system to scale as the size of the cluster increases, and it keeps decisions that are inherently ephemeral, such as the life-cycle of a specific workload, local to the node hosting that workload.
Metadata, the source of truth in Kubernetes
As I have discussed earlier, there are a huge number of moving parts in an at-scale microservice environment. There is just no way to uniquely refer to each and every one of those elements. Even things like IP addresses, which, in the past were fairly static, are much more ephemeral. One minute a given IP address may refer to a web front-end, the next minute that web front-end might be life-cycled out, and that IP address would then refer to a database or some other component. So, how do you “herd the kittens”?
Systems like Kubernetes do that by referring to metadata that is part of the code that defines the given service, infrastructure component, etc. In short, you attach labels or other metadata to components or applications and then write the declarative statements I mentioned earlier by referring to those same labels. A label can be applied to multiple components, and multiple labels can be attached to a single component.
This is a very flexible, automatable, and scalable mechanism to refer to all the components in a cluster.
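The canonical example of this label-based mechanism is a Kubernetes Service, which selects its backends purely by label rather than by IP address. The names below are hypothetical, matching the examples later in this post.

```yaml
# Hypothetical Service: routes traffic to every Pod carrying the label
# app: web-front-end, no matter which node it runs on or what IP address
# it currently has. Pods come and go; the label selector stays stable.
apiVersion: v1
kind: Service
metadata:
  name: webservice
spec:
  selector:
    app: web-front-end   # membership is defined by metadata, not by IP
  ports:
  - port: 443
    targetPort: 8080
```

When a labeled Pod is replaced and comes back with a new IP address, it is picked up automatically; nothing referring to the service has to change.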
An illustrative example
The current world
Let’s say that you have a web service that you want to expose to the rest of the world. That service will be handled by some number of actual server nodes that will need to scale with load. You will need to adjust firewall rules to allow traffic to the server nodes that are offering the service, and those rules will need to be dynamic because of the scaling. You will also want to do blue/green or canary deployment testing before new versions go fully into production. How would you do that in the legacy world?
- Define an IP address range and segment that would reserve enough space for the maximum number of servers that may be needed for scale and canary testing
- Define a set of firewall rules for that IP address range to allow HTTPS port access to those servers from the outside world
- Make sure that each server that is stood up for lifecycle management, scaling, canary deployments, etc., is hosted in the correct segment
- Make sure that each live server is referred to, by its IP address, in the load balancer that is exposing the service. Note: this needs to change each time the server fleet for the service changes.
- When a new version is released for canary testing:
- The canary servers need to be separately identified to the load balancer so that only, say, 10% of the traffic goes to them. This requires a special rule.
- If the canary or blue/green test is successful, then more of the new version servers need to be instantiated, and the old version retired, without interrupting service.
- The special canary rule in the load balancer needs to be removed.
- Create a DNS entry to make sure that the service is exposed via service discovery.
Most of these steps are manual, slow, and error-prone, and they do not scale to thousands of microservices and containers.
The new world
The same requirements as above.
- IP addresses are not used to identify workloads in a cloud-native environment, so I don’t need to reserve address ranges, segments, etc.
- The microservices that will ‘answer’ for the service will be identified by a label, let’s call it ‘web-front-end’. That label is attached to the definition of the microservice, and each instance of that microservice will be automatically labeled with that label.
- That definition will also describe how that microservice is to be scaled and health-checked. These rules will be automatically actioned by the orchestrator.
- The service will be defined with a name, let’s call it ‘webservice’.
- That service definition will say that any microservice with the label “web-front-end” will be able to answer requests that come in for ‘webservice’.
- The load balancer which exposes the service will be updated with all of the healthy microservice instances that answer for the service automatically.
- DNS would automatically be updated with the ‘webservice’ name in the service discovery system, etc.
- A network policy can be crafted that says that microservice instances labeled “web-front-end” are allowed to receive external traffic on port 443.
- Those rules would automatically be enforced on each instance of that microservice, no matter where in the infrastructure they are, or what their IP address is at that point in time.
- The microservice definition can also have a label that indicates which version of the code is to be used. That means that there could be two microservice definitions, one labeled “web-front-end” & “version1” and another that could be labeled “web-front-end” & “version2”.
- A change to the service definition for ‘webservice’ could identify that a canary deployment is desired, putting 10% of the load on version2 and the rest on version1.
- The load balancer would automatically be updated to enforce that traffic flow.
- Later changes to that rule (say, after a successful canary deployment) would instruct the orchestrator and load balancer to cooperate, automatically scaling up the version2 instances and retiring the version1 instances.
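The steps above can be sketched in plain Kubernetes as two Deployments that share the ‘web-front-end’ label but carry different version labels, fronted by one Service that selects only on the shared label. With vanilla Kubernetes, the 10% canary split is approximated by the replica ratio (an ingress controller or service mesh would be needed for an exact, percentage-based split); all names, images, and counts below are hypothetical.

```yaml
# Hypothetical canary sketch: both Deployments carry app: web-front-end,
# so the Service at the bottom sends traffic to all of their Pods.
# With 9 version1 replicas and 1 version2 replica, roughly 10% of
# requests land on the canary.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-front-end-v1
spec:
  replicas: 9
  selector:
    matchLabels:
      app: web-front-end
      version: version1
  template:
    metadata:
      labels:
        app: web-front-end
        version: version1
    spec:
      containers:
      - name: web
        image: registry.example.com/web-front-end:1.0.0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-front-end-v2
spec:
  replicas: 1                       # the canary
  selector:
    matchLabels:
      app: web-front-end
      version: version2
  template:
    metadata:
      labels:
        app: web-front-end
        version: version2
    spec:
      containers:
      - name: web
        image: registry.example.com/web-front-end:2.0.0
---
# The Service selects only on app: web-front-end, so it spans both versions.
apiVersion: v1
kind: Service
metadata:
  name: webservice
spec:
  selector:
    app: web-front-end
  ports:
  - port: 443
    targetPort: 8080
```

Promoting the canary is just another declarative change: scale version2 up, version1 down, and eventually delete the version1 Deployment, all through the same CI/CD pipeline.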
The key thing to realize here is that almost all of this is automated. What changes is the declarative statements that identify what something (a microservice, a service definition, etc.) is, via labels, and the rules that define how those labels are composed into a specific outcome (i.e., exposing a service).
The humans are freed to define “how the system is supposed to behave” and the system takes care of the repetitive tasks that are so often the source of misconfiguration errors and latency in meeting requirements. The system now reacts in real-time, rather than “trouble-ticket time.”
Interesting, so what do I do with this?
In the next installment, I will cover how these differences affect the infosec solutions we currently use to protect our environments, for good or ill. In the installment after that, I will cover some of the strategies to accommodate those changes.
Stay tuned – the ride has only just begun.