According to the recent Datadog report on real-world container usage, Redis is among the top 5 technologies used in containerized workloads running on Kubernetes.
Redis is often deployed across multi-region clusters to provide a highly available (HA) backend to a microservices application. However, while Kubernetes defines how networking and security policy are deployed and configured within a single cluster, it is challenging to enable pod-level inter-cluster communication, enforce security policies across cluster boundaries, and connect to services running in pods in other clusters.
Calico Clustermesh provides an elegant solution for running highly available Redis across multiple clusters without additional overhead. By default, Kubernetes pods can only see pods within their own cluster.
Using Calico Clustermesh, you can grant access to other clusters and the applications they are running. Calico Clustermesh comes with Federated Endpoint Identity and Federated Services.
Federated endpoint identity
Calico federated endpoint identity and federated services are implemented in Kubernetes at the network layer. To apply fine-grained network policy between multiple clusters, the pod source and destination IPs must be preserved. The prerequisite for enabling federated endpoints is therefore that clusters are designed with common networking across clusters (routable pod IPs) with no encapsulation.
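For illustration, in clusters where Calico also provides the pod network, this requirement maps to an IPPool with encapsulation disabled; in this lab, Azure CNI with VNet-routable pod IPs satisfies it instead. A minimal sketch (the CIDR is a placeholder):

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 10.0.0.0/16        # placeholder pod CIDR
  ipipMode: Never          # no IP-in-IP encapsulation
  vxlanMode: Never         # no VXLAN encapsulation
  natOutgoing: true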
Federated services
Federated services work with federated endpoint identity, providing cross-cluster service discovery for a local cluster. Federated services use the Tigera Federated Services Controller to federate all Kubernetes endpoints (workload and host endpoints) across all of the clusters. The Federated Services Controller accesses service and endpoints data in the remote clusters directly through the Kubernetes API.
HA and service federation use case
Overview
Let’s get started. The setup consists of a client application and a target application:
- A client application that needs to connect and transact constantly with another critical application/service, without which the client application essentially ceases to fulfill its primary business function. For the client application, we will use the Online Boutique/Hipstershop demo microservices application developed by Google, which requires a connection to a Redis database to create the necessary tables and store the state of the online store.
- A target application/service called by the client application that needs continuous uptime and high availability, and that we will federate across multiple clusters. The application/service we will be federating is Redis, deployed in an Active-Active configuration across multiple clusters.
Client microservices application: Hipstershop
How the Hipstershop ‘cartservice’ utilizes Redis
In the diagram above, the ‘Redis cache’ is the piece we are interested in federating. The ‘cart’ or ‘cartservice’ pod is responsible for consuming Redis as a K8s Service. The ‘cartservice’ pod does this by utilizing a container environment variable called REDIS_ADDR pointing to the DNS name of the Redis K8s Service in its Deployment spec as shown below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cartservice
spec:
  selector:
    matchLabels:
      app: cartservice
  template:
    metadata:
      labels:
        app: cartservice
    spec:
      serviceAccountName: default
      terminationGracePeriodSeconds: 5
      containers:
      - name: server
        image: gcr.io/google-samples/microservices-demo/cartservice:v0.5.1
        ports:
        - containerPort: 7070
        env:
        - name: REDIS_ADDR
          value: "redis-cart:6379"
        resources:
          requests:
            cpu: 200m
            memory: 64Mi
          limits:
            cpu: 300m
            memory: 128Mi
        readinessProbe:
          initialDelaySeconds: 15
          exec:
            command: ["/bin/grpc_health_probe", "-addr=:7070", "-rpc-timeout=5s"]
        livenessProbe:
          initialDelaySeconds: 15
          periodSeconds: 10
          exec:
            command: ["/bin/grpc_health_probe", "-addr=:7070", "-rpc-timeout=5s"]
The Redis service being consumed is a single pod/service deployed as part of the default configuration. We will replace this basic Redis pod with a full Redis Enterprise Cluster deployment backed by a database service installed on a multi-region, multi-cluster setup, and then federate that database service.
Lab environment and setup
Reference Github Repository
The environment can be set up in any Azure account using the Azure Cloud Shell.
Getting Started
Azure Components
The first step is to understand the Azure AKS multi-cluster environment we will be setting up and working in.
- Two regions for the Azure resources/resource groups to be deployed in: westus and canadacentral
- One VNet with a minimum of one subnet deployed per region
- VNet peering enabled and set up between the VNets in the two regions
- Azure CNI used so that pods get IPs allocated from the VNets and those pod IPs are routable between the peered VNets
- An AKS cluster deployed in each region with its respective configuration (a CLI sketch of the key commands follows this list)
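The Azure pieces can be stitched together roughly as follows; resource group, VNet, and cluster names (and the subscription ID) are placeholders for this lab, and the mirror-image peering command also has to be run from the canadacentral side:

# Peer the two regional VNets so pod IPs are routable between regions
az network vnet peering create \
  --resource-group rg-westus \
  --name westus-to-canadacentral \
  --vnet-name vnet-westus \
  --remote-vnet /subscriptions/<sub-id>/resourceGroups/rg-canadacentral/providers/Microsoft.Network/virtualNetworks/vnet-canadacentral \
  --allow-vnet-access

# Create an AKS cluster that uses Azure CNI so pods get routable VNet IPs
az aks create \
  --resource-group rg-westus \
  --name aks-westus \
  --network-plugin azure \
  --vnet-subnet-id /subscriptions/<sub-id>/resourceGroups/rg-westus/providers/Microsoft.Network/virtualNetworks/vnet-westus/subnets/aks-subnet \
  --node-count 3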
Deploying Redis in active-active HA mode
The Redis on Kubernetes architecture uses an operator-based deployment to deal with the nuances associated with deploying a Redis Enterprise Cluster. The Redis Enterprise API and the database services are deployed in pods, with quorum maintained by a minimum of three pods per deployment in a cluster. Node taints/tolerations are applied so that, ideally, there is one Redis pod per worker node (requiring at least three worker nodes).
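A minimal RedisEnterpriseCluster resource for this topology might look like the following sketch; the API version and cluster name follow the service manifest shown later in this walkthrough, and only the node count is significant here:

apiVersion: app.redislabs.com/v1alpha1
kind: RedisEnterpriseCluster
metadata:
  name: demo-clustera      # demo-clusterb on the second cluster
  namespace: redis
spec:
  nodes: 3                 # minimum of three pods for quorum, ideally one per worker node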
The Active-Active Redis Enterprise Cluster uses an Ingress resource to allow the API and databases to sync between clusters. As it is currently difficult to decouple the Redis Operator from this model, we will create the ingress resources as required, but then federate the ClusterIP Redis database service between both clusters for the Hipstershop cartservice to point to.
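For reference, the sync path is typically exposed through an SSL-passthrough ingress in front of the Redis Enterprise API service. The sketch below is illustrative rather than the exact operator-generated resource: the hostname is a placeholder, and the backend service name and port are assumptions based on the demo-clustera REC used in this lab:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rec-api
  namespace: redis
  annotations:
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"   # keep TLS end-to-end to the REC API
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
spec:
  ingressClassName: nginx
  rules:
  - host: api.demo-clustera.example.com        # placeholder DNS name
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: demo-clustera                # REC API service created by the operator
            port:
              number: 9443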
The Redis Enterprise Cluster is then deployed on both clusters, with an identically named namespace created on each. This is a prerequisite for service federation because kube-dns uses the namespace as part of the service FQDN in Kubernetes.
In each cluster, a ClusterIP database service (named testdb in this example) is set up, backing a clean, empty Active-Active Redis database that now syncs between the two clusters. This is the target service for federation.
apiVersion: v1
kind: Service
metadata:
  annotations:
    redis.io/last-keys: '[]'
  creationTimestamp: "2023-04-05T19:54:11Z"
  labels:
    app: redis-enterprise
    federation: "yes"
    redis.io/bdb: "2"
    redis.io/cluster: demo-clustera
  name: testdb
  namespace: redis
  ownerReferences:
  - apiVersion: app.redislabs.com/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: RedisEnterpriseCluster
    name: demo-clustera
    uid: 485d9f4d-643b-4c0d-90da-07bd6b052b19
  resourceVersion: "518731"
  uid: 14d08a31-cd9b-4aba-a203-ae35bd34f437
spec:
  clusterIP: 10.0.27.178
  clusterIPs:
  - 10.0.27.178
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: redis
    port: 11069
    protocol: TCP
    targetPort: 11069
  selector:
    redis.io/bdb-2: "2"
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
In the lab setup with the two clusters, we can check the endpoints to see the IPs of the backing pods for the active Redis svc in each cluster.
westus cluster:
# kubectl get endpoints -n redis | grep testdb
testdb   10.0.0.40:11069
canadacentral cluster:
# kubectl get endpoints -n redis | grep testdb
testdb   10.1.0.131:11069
Creating federated endpoints
Before a service can be federated, the clusters first need to be aware of each other and collect each other’s remote pod endpoints, which we call federated endpoints. Calico implements a Custom Resource Definition (CRD) object called RemoteCluster that is deployed on a local cluster to allow it to reference and collect endpoints from a remote cluster. In a two-cluster scenario, for example, this object needs to be applied on both clusters so that each cluster is able to find, authenticate to, and successfully pull the endpoints of the pods in the other (remote) cluster.
This consists of a few steps provided that the prerequisite of pod-to-pod routing is enabled and working between the clusters:
- Create kubeconfig files for each cluster
- Create secrets for each cluster to authenticate to the other remote cluster
- Create proper RBAC access to the secrets for the clusters
- Add remote cluster configurations using the RemoteCluster CRD (a sketch of the last two steps follows this list)
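As an illustrative sketch of the last two steps on the westus cluster (the secret name, the calico-system namespace, and the kubeconfig file path are assumptions; RemoteClusterConfiguration is the resource kind that implements the RemoteCluster object described above):

# Store the remote cluster's kubeconfig in a secret the local Calico installation can read
kubectl create secret generic remote-cluster-secret-canadacentral -n calico-system \
  --from-literal=datastoreType=kubernetes \
  --from-file=kubeconfig=./canadacentral-kubeconfig

# Reference that secret from a remote cluster configuration on the local cluster
apiVersion: projectcalico.org/v3
kind: RemoteClusterConfiguration
metadata:
  name: calico-demo-remote-canadacentral
spec:
  clusterAccessSecret:
    name: remote-cluster-secret-canadacentral
    namespace: calico-system
    kind: Secret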
Finally, once the configurations are applied, the list of remote endpoints can be fetched and verified with the calicoq CLI tool, where each endpoint is prefixed with the name of the RemoteCluster object designated for the remote cluster.
For example, for a RemoteCluster object named calico-demo-remote-canadacentral, the list of remote Redis pod endpoints may look like this:
Workload endpoint calico-demo-remote-canadacentral/aks-nodepool1-86764462-vmss000000/k8s/redis.demo-clusterb-1/eth0
Workload endpoint calico-demo-remote-canadacentral/aks-nodepool1-86764462-vmss000001/k8s/redis.demo-clusterb-0/eth0
Workload endpoint calico-demo-remote-canadacentral/aks-nodepool1-86764462-vmss000002/k8s/redis.demo-clusterb-2/eth0
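The listing above can be produced with calicoq's selector evaluation; the all() selector and the grep filter below are just one way to narrow the output to the remote Redis pods:

calicoq eval "all()" | grep redis.demo-clusterb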
Deploying the client application: Google Online Boutique demo/Hipstershop
The Google Online Boutique application YAML needs to be modified from the standard configuration so that the cartservice pods point to the ‘testdb’ Redis database service, as shown in the snippet below.
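Concretely, the change boils down to pointing the cartservice container’s REDIS_ADDR at the testdb service in the redis namespace on the database port created above:

      containers:
      - name: server
        env:
        - name: REDIS_ADDR
          value: "testdb.redis:11069"   # was redis-cart:6379 in the stock manifest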

The service graph shows the associations between the hipstershop services and the redis testdb service.

Accessing the Hipstershop frontend
The frontend-external service provides a public LoadBalancer IP to access the frontend. With everything working correctly, the app can be accessed and shows the online store.
~# kubectl get svc -n hipstershop | grep LoadBalancer
frontend-external LoadBalancer 10.0.24.223 20.1.14.187 80:31464/TCP 19d
Items can be added to the cart, which makes use of the cartservice microservice, and purchased in the mock demo app. To test the cartservice state, we want to keep items in the cart so we can see how the application is affected when it can no longer query the Redis database service.
Inducing a failure scenario and making Redis unavailable in one cluster
To see the effect of ‘breaking’ the Redis service on the application, we can put Redis into ‘recovery mode’ in one cluster to induce a failure and watch the frontend break when the cartservice is unable to reach Redis.
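One way to do this in the lab on the westus cluster is sketched below; the clusterRecovery flag is an assumption about the Redis Enterprise operator’s API, and any disruption that empties the local testdb endpoints serves the same purpose for this demo:

# Hypothetical: flip the local Redis Enterprise Cluster into recovery mode so its
# database pods stop serving and the local testdb endpoints empty out
kubectl patch rec demo-clustera -n redis --type merge -p '{"spec":{"clusterRecovery":true}}'

# Confirm the local service has lost its endpoints
kubectl get endpoints testdb -n redis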
The frontend breaks because, at this point, we have not yet created a federated service or made any configuration changes to the testdb service. Let’s go ahead and do that now.
Federating the Redis database service
As we already have federated endpoints and both clusters are aware of each other’s remote endpoints, we can move to the step of actually federating the Redis testdb service that was initially created.
Service federation requires the following prerequisites/considerations:
- Since a federated service is a set of services with consolidated endpoints, it looks like a regular Kubernetes Service, but instead of a pod selector it uses an annotation (a service selector) that matches labels on the service(s) backing it.
- Only services in the same namespace as the federated service are included. This implies namespace names across clusters are linked (this is a basic premise of federated endpoint identity).
First, we need to label the testdb service in all our clusters so that the Tigera controller can federate it. Do this in each cluster:
kubectl label svc -n redis testdb federation=yes
Then we apply the federated service YAML config, which defines the testdb-federated service with the required ‘special’ annotation federation.tigera.io/serviceSelector matching the label we just applied. This annotation tells the Tigera Federated Services controller which services to target:
apiVersion: v1
kind: Service
metadata:
  name: testdb-federated
  namespace: redis
  annotations:
    federation.tigera.io/serviceSelector: federation == "yes"
spec:
  ports:
  - name: redis
    port: 11069
    protocol: TCP
  type: ClusterIP
Once the service is created and federated, we can see that the testdb-federated service has been populated with the pod endpoints of the service active on the other (remote) cluster, even though the local testdb service has no endpoints because the local Redis database/service has been taken down. On westus, with the local testdb endpoints empty due to the Redis service being down, the federated testdb-federated service still has the remote pod endpoint available from the canadacentral cluster:
# kubectl get endpoints -n redis | grep testdb
testdb             <none>             19d
testdb-federated   10.1.0.131:11069   18d
The federated endpoints look like:
kubectl get endpoints testdb-federated -n redis -oyaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    federation.tigera.io/serviceSelector: federation == "yes"
  creationTimestamp: "2023-04-06T16:12:39Z"
  name: testdb-federated
  namespace: redis
  resourceVersion: "2901443"
  uid: 7414e1d3-594e-441d-ba7c-669c530cefd0
subsets:
- addresses:
  - ip: 10.1.0.131
    nodeName: aks-nodepool1-20240168-vmss000004
    targetRef:
      kind: Pod
      name: calico-demo-remote-canadacentral/demo-clusterb-2
      namespace: redis
      resourceVersion: "2899484"
      uid: f1e3630d-fffa-4475-8893-d6f1aacb74f6
  ports:
  - name: redis
    port: 11069
    protocol: TCP
Here we can see that the remote cluster calico-demo-remote-canadacentral is still advertising the pod endpoint of its working service to the federated service.
Finally, the cartservice needs to use the new testdb-federated service. This is done by changing the REDIS_ADDR environment variable in its Deployment to point to testdb-federated:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "4"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"name":"cartservice","namespace":"hipstershop"},"spec":{"selector":{"matchLabels":{"app":"cartservice"}},"template":{"metadata":{"labels":{"app":"cartservice"}},"spec":{"containers":[{"env":[{"name":"REDIS_ADDR","value":"testdb.redis:11069"}],"image":"gcr.io/google-samples/microservices-demo/cartservice:v0.5.1","livenessProbe":{"exec":{"command":["/bin/grpc_health_probe","-addr=:7070","-rpc-timeout=5s"]},"initialDelaySeconds":15,"periodSeconds":10},"name":"server","ports":[{"containerPort":7070}],"readinessProbe":{"exec":{"command":["/bin/grpc_health_probe","-addr=:7070","-rpc-timeout=5s"]},"initialDelaySeconds":15},"resources":{"limits":{"cpu":"300m","memory":"128Mi"},"requests":{"cpu":"200m","memory":"64Mi"}}}],"serviceAccountName":"default","terminationGracePeriodSeconds":5}}}}
  creationTimestamp: "2023-04-05T20:08:16Z"
  generation: 4
  name: cartservice
  namespace: hipstershop
  resourceVersion: "2753897"
  uid: 8abc3df1-8f68-4301-9875-20fa76535461
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: cartservice
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: cartservice
    spec:
      containers:
      - env:
        - name: REDIS_ADDR
          value: testdb-federated.redis:11069
        image: gcr.io/google-samples/microservices-demo/cartservice:v0.5.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - /bin/grpc_health_probe
            - -addr=:7070
            - -rpc-timeout=5s
          failureThreshold: 3
          initialDelaySeconds: 15
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: server
        ports:
        - containerPort: 7070
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - /bin/grpc_health_probe
            - -addr=:7070
            - -rpc-timeout=5s
          failureThreshold: 3
          initialDelaySeconds: 15
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 300m
            memory: 128Mi
          requests:
            cpu: 200m
            memory: 64Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: default
      serviceAccountName: default
      terminationGracePeriodSeconds: 5
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2023-04-05T20:08:16Z"
    lastUpdateTime: "2023-04-06T16:14:12Z"
    message: ReplicaSet "cartservice-659f98f64f" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2023-04-24T17:18:00Z"
    lastUpdateTime: "2023-04-24T17:18:00Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 4
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
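Rather than editing the full manifest, the same change can also be applied in place with kubectl set env, using the namespace and value from this lab:

kubectl set env deployment/cartservice -n hipstershop REDIS_ADDR=testdb-federated.redis:11069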
Once this is done, the cartservice pods will update and the Hipstershop web page becomes accessible again, using the remote cluster’s Redis endpoints via the federated service even though the local service is unavailable, thus ensuring HA.
Now when the web page is refreshed, the cart page comes back up and retains its state with the items in the cart, showing that the Redis database entries were properly replicated to the other cluster and retrieved from there.
Conclusion
Keeping a critical database service like Redis highly available for microservices to consume in a multi-region, multi-cluster configuration can be challenging given Kubernetes’ native single-cluster architecture model. Calico Clustermesh addresses this use case by federating pod endpoints and services for real-time databases and other similar applications.
Want to learn more? Get started with a free Calico Cloud trial.