A proper container security strategy involves evaluating all components in the system.
Thoughts from 35,000 feet
Two experiences in the last 24 hours have encouraged me to write this missive. At least it seems like a good idea while sitting in a sealed aluminum tube at 35,000 feet.
The first was an animated conversation with a very respectable colleague with quite a bit of operational security experience, about which companies should be listed as “container security providers” in a particular survey.
The second was reading an article in a recent issue of Proceedings. The author made an interesting observation about the U.S. Navy. While I am not qualified to judge the accuracy of the observation with regards to the Navy, the comment, as the author states, is widely applicable in many (ok, maybe all) technically driven communities:
Although the tendency to see technical solutions to nontechnical[sic] problems is not unique to the U.S. Navy, it often results in an inversion of the proper relationship between maritime strategy and acquisitions. Too often, technological limitations drive shortsighted strategy: The United States decides what it wants to accomplish strategically based on what it can accomplish technically.
Now, how do I get from an argument with a colleague and a treatise on the U.S. Navy’s behavior with regards to strategy & acquisitions in a post-cold-war world, to a discussion about container security? Actually, I’m curious about that as well…
Guilds, Priests, and Mages
In the dim, dark corners of IT history, each component in the IT infrastructure stack was technically complex, arcane, and fairly brittle. Much as in the days of the guilds, we created priesthoods that owned the secrets of each particular technology. We call them silos or teams. You had the network priesthood, the storage guild, the code conjurers, the security mages, etc. And they defended their prerogatives, their favorite suppliers, and their design patterns.
For eons, organizing this way worked. The environment was static enough that if the code conjurers needed some new capability from the storage guild, they could request it via an arcane incantation called an RFE (request for enhancement), and in some number of months, the new capability would magically appear. That static pattern applied to each silo, including security. The accepted model was that every team would deliver on their RFEs and, as the final act, the security mages would come in, sing their spells, and clean up the mess. People said that “everyone is a security mage,” but we all know what the real facts on the ground were, don’t we?
This vertically oriented guild system starts breaking down as things become more dynamic and interconnected. When we started this journey, we had dedicated servers that existed for years, each supporting specific applications, connected to dedicated networking hardware, pulling storage from dedicated SANs and/or NASes, all protected by static firewalls, anomaly-detection infrastructure, etc. Then came virtualization: now physical servers needed to broker networking and storage to the virtual servers that they hosted, and those virtual servers could belong to different trust realms. On top of all this mixing, those VMs now lasted months, not years, and we had more of them. All of a sudden, the server union started doing networking and storage, the storage folks needed to do networking, etc. The vertical orientation came under stress and, as a result, often became even more insular. A side effect is that we became used to thinking of strategic goals, such as protection of assets or flexible connectivity, as being defined by what a particular authoritative guild (e.g., security or networking) could deliver based on the tactical capabilities of its particular supplier community.
Nowadays things are much more dynamic and interconnected as we move to containers, microservices, and overall cloud-native design patterns. Vertical silos can no longer assert exclusivity over their domains, which have been merged into a common resource set by the integration of all of the platforms. And the faster development and release cycles the organization requires make the classical design-by-RFE, thrown-over-the-wall approach achingly outmoded and ill-suited to meet the organization’s requirements.
An Organizational RFE
There’s been much discussion about the organizational change (RFE) that is needed. Some call it DevSecOps, others call it horizontal integration or delivery teams. By whatever name, the idea is the same: the organization is focused on the delivery of a requirement, not the delivery of a siloed infrastructure. There still needs to be coordination among those specialties to ensure that the organization does not duplicate effort or work at cross-purposes, but the driver is to responsively deliver integrated solutions, not to dictate adherence to tactical capabilities that were (potentially) developed in a vacuum.
What does this mean for Security teams?
Let’s think about what it means to secure a container or cloud-native environment, and whether it is possible for a single team or product to execute on those requirements. Some example requirements are listed below. They are by no means even close to exhaustive:
- The underlying platform must be secure (OS, orchestration system, etc.)
- The provenance and identity of the containers that Kubernetes is managing/life-cycling on your behalf must be validated and non-repudiable.
- The containers need to be statically analyzed or scanned for CVEs and other vulnerabilities.
- The components that make up the container must come from trusted sources.
- Storage must enforce ACLs.
- Storage must encrypt data in transit and at rest.
- The network must enforce a least-privilege or zero-trust model at all layers where communication happens.
- The network must enforce encryption and authentication of actual flows.
- Users should only be able to perform operations that are required by their tasking (the concept of least privilege and/or zero trust).
- Services need to authenticate and authorize at the application layer.
- Service discovery needs to be trustworthy.
- Logs must be accurate, immutable, and non-repudiable.
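To make one requirement from the list concrete, here is a minimal sketch of digest pinning, the basic mechanism behind validated, non-repudiable container provenance: an image blob is accepted only if its cryptographic hash matches the digest recorded at build time. The `verify_image_digest` helper is hypothetical, not the API of any real registry or scanner; a production deployment would rely on signed digests published by your build and registry tooling rather than this sketch.

```python
import hashlib
import hmac

def verify_image_digest(image_bytes: bytes, expected_digest: str) -> bool:
    """Check an image blob against a pinned digest of the form "sha256:<hex>"."""
    algo, _, expected_hex = expected_digest.partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algo!r}")
    actual_hex = hashlib.sha256(image_bytes).hexdigest()
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(actual_hex, expected_hex)

# Pin the digest at build time, then verify before running the image.
blob = b"example container layer contents"
pinned = "sha256:" + hashlib.sha256(blob).hexdigest()
print(verify_image_digest(blob, pinned))         # the original blob
print(verify_image_digest(b"tampered", pinned))  # an altered blob
```

Note that hashing alone only proves integrity; non-repudiation additionally requires that the pinned digest itself be signed by a trusted party, which is where storage, networking, and identity systems all get pulled into the security story.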
Given the diversity of this example list, it is fairly obvious that there is no single solution, product, etc. that provides all of these capabilities. You will also notice that most are part of larger sets of capabilities, such as storage or networking. Given that these requirements must all be met for a secure cloud-native or container environment, I would posit that saying that some specific component is a cloud-native or container security product or solution is maybe a bit of an overreach. There are many products or solutions that contribute to a secure environment, usually as part of a larger set of necessary functions.
Often spoken about but rarely executed is the principle that the security of the environment must be addressed by all of the components that enable it. To do that effectively, security must be designed into those components from the start.
So, I would suggest that there really can’t be a single product or solution category called “container security.” Let’s look at it from the other side. If we call a solution or product a “container security” product, does that really mean that deploying that one thing makes your environment secure? I think not. Instead, you might characterize specific solutions and products as being a secure (or insecure) implementation of their function(s). Security needs to be pervasive and diffuse in these environments. Let’s recognize this by correctly identifying that security is the responsibility of all the components in the system and that there is no magic silver bullet.