Many years ago, more than I care to count or remember, I was a network engineer in a remote location with a fairly substantial logistics warehouse facility. Everything was in a database, but, alas, there wasn’t really much of a process for how things in the database should be identified. Some items could be exactly the same but named five or more different ways. This led to duplicated ordering, confusion when searching for a part, and general chaos. A database, or any naming scheme, without a taxonomy will rapidly become chaotic as the system grows.
Here at Tigera, we make substantial use of various bits of metadata available in the Kubernetes system, including labels and namespaces. This is in line with how Kubernetes network policy works, but that metadata is used for many more tasks in Kubernetes than just network policy.
When we help our customers design their security policies, more often than not, we actually spend quite a bit of time discussing naming taxonomies, especially the scheme around naming the key and value components of a label, and the convention around namespaces. However, we are also aware that these resources are used for much more than just network policy in Kubernetes. Therefore, the scheme that is developed needs to accommodate all use cases, not just the network policy use case.
What goes into a name?
While a naming taxonomy doesn’t need to be and shouldn’t be overly complex, there are a few things to get right, as redeveloping a new schema when the prior one falls apart is not anyone’s idea of fun.
Here are some things that you should consider when developing your taxonomy:
- A key in a Kubernetes label can only have one value
- Think about states when coming up with key/value pairs
- A scheme for keys and namespaces should plan for the use of RBAC controls
- Labels should not define an endpoint, but maybe functions, ‘personalities’, or capabilities
- An endpoint can only belong to a single namespace; be quite careful what you use namespaces for
- A pod should only offer a single service, but that isn’t always going to be the case
- Define a clear process of creating new keys and namespaces, and limit access
Let’s look at these in a bit more detail.
Keys are limited to a single value
This means that some of the obvious first choices, such as role: LDAPServer, may not make sense. If a given pod has multiple roles, you’ve got a problem, as you can’t have two labels on the same object with the same key, for example, role: LDAPServer and role: SYSLOGclient. Values also can’t be a dictionary of entries.
You might think that a given microservice would only have one role, and in a perfect world, that might be correct; but we don’t live in a perfect world. What about that monolith that you were just told to lift-and-shift from a VM? It might have five functions! In fact, in Tigera’s solution, we use the same labeling scheme for workloads that aren’t pods, such as VMs and even native bare-metal hosts. So maybe use a producer/consumer model instead of a role model.
In the previous example, the labels might be:
- LDAP: Producer
- SYSLOG: Consumer
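As a minimal sketch of why the single-value constraint matters, the Python below models a label set as a plain dictionary, which has the same one-value-per-key property as Kubernetes labels. The `matches` helper is a hypothetical stand-in for an equality-based label selector, not a Kubernetes API.

```python
# Labels modeled as a dict: like Kubernetes labels, one value per key.
role_labels = {}
role_labels["role"] = "LDAPServer"
role_labels["role"] = "SYSLOGclient"  # replaces the first role, it does not add a second

# Producer/consumer model: one key per function, so nothing collides.
function_labels = {"LDAP": "Producer", "SYSLOG": "Consumer"}


def matches(labels, selector):
    """Hypothetical equality-based selector: every selector pair must be present."""
    return all(labels.get(key) == value for key, value in selector.items())


print(role_labels)                                        # {'role': 'SYSLOGclient'}
print(matches(role_labels, {"role": "LDAPServer"}))       # False
print(matches(function_labels, {"LDAP": "Producer"}))     # True
print(matches(function_labels, {"SYSLOG": "Consumer"}))   # True
```

With the role model, the LDAP role is silently lost; with one key per function, both facts about the workload stay selectable.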
And values can define states
Similarly, using a key/value pair of Compliance: PCI may be problematic, as you might have a pod that is both PCI and GDPR compliant. You could then end up with three values: PCI, GDPR, and PCI_GDPR. But that would mean that if you wanted to match endpoints that were PCI compliant, your label selector would have to match on both PCI and PCI_GDPR, and possibly GDPR_PCI, since someone will eventually forget the correct ordering.
To handle this case, you might want to use:
- PCI: Compliant
- GDPR: Compliant
- PCI: Gateway (for workloads that act as PCI – nonPCI bridges)
- GDPR: NonCompliant
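The difference between the two schemes can be sketched in a few lines of Python. The workload names are hypothetical, and the list comprehensions stand in for label selectors; this is an illustration of the matching logic, not a Kubernetes API.

```python
# Combined-value scheme: one Compliance key whose value encodes every regime.
combined = {
    "billing": {"Compliance": "PCI"},
    "profiles": {"Compliance": "GDPR"},
    "checkout": {"Compliance": "PCI_GDPR"},
    "refunds": {"Compliance": "GDPR_PCI"},  # someone swapped the ordering
}
# The selector has to enumerate every value that implies PCI.
pci_values = {"PCI", "PCI_GDPR", "GDPR_PCI"}
pci_combined = sorted(
    name for name, labels in combined.items()
    if labels.get("Compliance") in pci_values
)

# Per-regime scheme: one key per regime, the value carries the state.
per_key = {
    "billing": {"PCI": "Compliant"},
    "profiles": {"GDPR": "Compliant"},
    "checkout": {"PCI": "Compliant", "GDPR": "Compliant"},
    "refunds": {"PCI": "Compliant", "GDPR": "Compliant"},
}
# The selector is a single equality test and never needs updating.
pci_per_key = sorted(
    name for name, labels in per_key.items()
    if labels.get("PCI") == "Compliant"
)

print(pci_combined)  # ['billing', 'checkout', 'refunds']
print(pci_per_key)   # ['billing', 'checkout', 'refunds']
```

Both selections return the same workloads, but adding a third regime to the combined scheme would mean revisiting `pci_values` in every policy, while the per-regime selector stays as it is.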
Another related approach might be to classify by the data being handled, rather than the compliance state of the endpoint. For example:
- PCI: Tainted
- PCI: Clean
- GDPR: Tainted
- GDPR: Clean
These are just examples, but they do start you thinking in a potentially useful way.
Plan for RBAC
Right now, you might trust everyone, but, in time, you will not. Kubernetes RBAC capabilities come in quite handy when limiting the blast radius of bad judgment, or a bad actor. Therefore, you might want to plan on some form of grouping of keys and namespaces to orient them to RBAC groups. This could simply be reserving some keys and namespaces for given RBAC groups or using prefixes in the name to indicate which RBAC groups have access to those objects.
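One way to picture the prefix idea: Kubernetes label keys may carry an optional prefix, separated from the name by a slash, so a prefix can signal which group owns a key. The sketch below is only an illustration of that convention, as it might appear in a review script. The prefixes and group names are hypothetical, and the check is plain Python, not a call into Kubernetes RBAC.

```python
from typing import Optional

# Hypothetical mapping from label-key prefix to the RBAC group that owns it.
PREFIX_OWNERS = {
    "security.example.com": "security-admins",
    "platform.example.com": "platform-team",
}


def owning_group(label_key: str) -> Optional[str]:
    """Return the owning group for a prefixed key, or None if unprefixed or unknown."""
    prefix, sep, _name = label_key.partition("/")
    if not sep:
        return None  # no slash, so the key has no prefix
    return PREFIX_OWNERS.get(prefix)


def may_set(label_key: str, user_groups: set) -> bool:
    """Allow a change only when the user belongs to the group that owns the prefix."""
    owner = owning_group(label_key)
    return owner is not None and owner in user_groups


print(may_set("security.example.com/PCI", {"security-admins"}))  # True
print(may_set("security.example.com/PCI", {"platform-team"}))    # False
print(may_set("PCI", {"security-admins"}))                       # False
```

In this sketch, unprefixed keys are rejected outright; whether that is the right default is a design decision for your own scheme.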
Labels are not synonymous with a service or pod name
Today you have a single monolithic database that handles all of your customer records. That’s great. So, obviously, you just label it CustRecord: Producer and write all of your policies to it. Life is good, until you have to split out the customer PII data for GDPR, the customer payment data for PCI, and the customer prescription records for HIPAA. You knew they were coming, but you were hoping the compliance police were going to forget about you (they never do, btw). So, now you are going to have a number of segmented customer databases, and you’ll need to change all of your policies. If, instead, you had attached the following labels to your database and written your policies against them, you would only have had to change where each label was attached, and the policies would have picked up the correct memberships automatically.
- CustOrderRecord: Producer
- CustHIPAARecord: Producer
- CustPmntRecord: Producer
- CustPiiRecord: Producer
You will never get this 100% correct, but with a bit of forethought, you can get surprisingly close.
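To make the database split concrete, here is a small Python sketch under stated assumptions: the database names are hypothetical, only three of the keys from the list above are used to keep it short, and `members` is a stand-in for how a policy resolves its label selector to a set of endpoints.

```python
def members(workloads, selector):
    """Return the names of workloads whose labels satisfy every selector pair."""
    return sorted(
        name for name, labels in workloads.items()
        if all(labels.get(key) == value for key, value in selector.items())
    )


# The policy is written once, against the function, not the database name.
payment_selector = {"CustPmntRecord": "Producer"}

# Before the split: one monolith carries every label.
before = {
    "customer-db": {
        "CustOrderRecord": "Producer",
        "CustPmntRecord": "Producer",
        "CustPiiRecord": "Producer",
    },
}

# After the split: each label moves to the database that now serves that data.
after = {
    "orders-db": {"CustOrderRecord": "Producer"},
    "payments-db": {"CustPmntRecord": "Producer"},
    "pii-db": {"CustPiiRecord": "Producer"},
}

print(members(before, payment_selector))  # ['customer-db']
print(members(after, payment_selector))   # ['payments-db']
```

The selector is identical in both cases; only the placement of the labels changed, and the policy membership followed.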
Don’t overuse namespaces
A given endpoint can only belong in one namespace, so please use them only for coarse segregation, in cases where there can never be an overlap, for example, tenancy or dev, test, and prod. Anything more complex will lead to potential pain down the road.
Who gets to create keys and namespaces
The most beautifully designed taxonomy won’t stay so if anyone is allowed to create anything without review. It’s always easier to create your own key rather than find the ‘blessed’ one, especially if “it will only ever be used for this one private case, I promise.” That said, new keys and namespaces will always be needed. It’s best to sort out, in advance, the process to create new keys and namespaces, how to find out what the current dictionary is, and how to review the new additions and gazette them so that the model stays clean and maintainable.
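A review step like this can be partly automated. The sketch below assumes a hypothetical dictionary of approved keys and values, loosely based on the examples earlier in this post, and checks a proposed label set against it. How the dictionary is stored and who is allowed to extend it are left open.

```python
# Hypothetical dictionary of 'blessed' keys and the values each may take.
APPROVED_LABELS = {
    "PCI": {"Compliant", "NonCompliant", "Gateway"},
    "GDPR": {"Compliant", "NonCompliant"},
    "LDAP": {"Producer", "Consumer"},
    "SYSLOG": {"Producer", "Consumer"},
}


def review(labels):
    """Return a list of problems; an empty list means the labels conform."""
    problems = []
    for key, value in sorted(labels.items()):
        allowed = APPROVED_LABELS.get(key)
        if allowed is None:
            problems.append(f"unknown key: {key}")
        elif value not in allowed:
            problems.append(f"unapproved value for {key}: {value}")
    return problems


print(review({"PCI": "Compliant", "LDAP": "Producer"}))  # []
print(review({"pci": "yes"}))                            # ['unknown key: pci']
print(review({"GDPR": "Gateway"}))                       # ['unapproved value for GDPR: Gateway']
```

A check of this kind catches the "private, one-off" key before it lands, and its error messages point people toward the existing dictionary.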
This is not meant to be a how-to guide on what labels and namespaces to use, but hopefully, it gets you to think a bit about a taxonomy or schema to use for them when it comes time to plan out your Kubernetes deployment.