A service mesh is a dedicated infrastructure layer for managing, controlling, and observing communication between microservices in a distributed system. It uses lightweight network proxies, called sidecars, which are deployed alongside each service instance.
These sidecars intercept and manage inter-service traffic, allowing for features such as load balancing, traffic routing, security enforcement, and observability. By abstracting these concerns away from the application code, service meshes help improve the overall scalability, resilience, and maintainability of microservices-based applications. At the same time, service meshes introduce challenges, including added complexity, performance overhead, and the relative immaturity and limited adoption of the technology.
Adopting a service mesh brings significant advantages to organizations that develop and maintain complex, large-scale, distributed microservices-based applications. Chief among them is the decoupling of the application’s business logic from the operational concerns of managing and controlling inter-service communication.
This separation allows developers to focus on writing and deploying their code, while the service mesh handles the underlying complexities of the network, such as load balancing, traffic routing, and security enforcement. As a result, organizations can achieve faster development cycles, reduced deployment risks, and increased agility in responding to changing business requirements.
Additionally, a service mesh provides a unified and consistent way to monitor, manage, and troubleshoot microservices interactions. By centralizing the control and observability of inter-service communication, organizations can more easily diagnose and resolve issues, optimize network performance, and maintain a high level of reliability and resilience across the entire application.
API gateways and service meshes are two different architectural components that address different concerns in a microservices-based application. While they may have some overlapping functionality, they serve distinct purposes and are often used together for a comprehensive solution.
API Gateway: handles external (north-south) traffic, acting as a single entry point through which outside clients reach the application’s microservices.

Service mesh: handles internal (east-west) traffic, managing communication between the microservices themselves within the distributed system.
In summary, an API gateway is focused on managing external client access to microservices, whereas a service mesh manages communication between microservices within the distributed system. Both components can complement each other, with the API gateway serving as the ingress point for external requests and the service mesh ensuring reliable and secure communication within the application.
Service mesh architecture is designed to provide a dedicated infrastructure layer for managing, controlling, and observing communication between microservices in a distributed system. Key components and concepts of service mesh architecture include:
The data plane is responsible for managing the traffic between microservices. It consists of lightweight network proxies, called sidecars, that are deployed alongside each microservice instance. Sidecars intercept and manage inter-service communication, handling tasks such as load balancing, traffic routing, and enforcing security policies.
The control plane is the central management layer of the service mesh, responsible for configuring and monitoring the data plane. It provides an interface for users to define and enforce policies, configurations, and traffic rules. The control plane also collects metrics, logs, and tracing information from the sidecars to provide observability into the microservices’ communication.
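The control-plane/data-plane split above can be sketched in a few lines. This is a toy model, not any real mesh's API: the class names, the dict-based config, and the direct `configure` call stand in for what a production control plane would push to proxies over a protocol such as Envoy's xDS.

```python
class Sidecar:
    """Toy data-plane proxy: holds whatever routing/policy config
    the control plane last pushed to it."""

    def __init__(self):
        self.config = {}

    def configure(self, config):
        # Replace the active config atomically with the pushed snapshot.
        self.config = dict(config)


class ControlPlane:
    """Toy control plane: tracks every registered sidecar and pushes
    the desired configuration out to all of them."""

    def __init__(self):
        self.sidecars = []

    def register(self, sidecar):
        self.sidecars.append(sidecar)

    def apply(self, config):
        # In a real mesh this push happens over the network (e.g. xDS/gRPC).
        for sc in self.sidecars:
            sc.configure(config)


cp = ControlPlane()
a, b = Sidecar(), Sidecar()
cp.register(a)
cp.register(b)
cp.apply({"timeout_ms": 500, "retries": 2})
```

The key point the sketch illustrates: operators declare intent once at the control plane, and every proxy in the data plane converges on the same configuration.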
A sidecar proxy is a lightweight network proxy deployed alongside each microservice instance, which intercepts and manages the traffic between microservices. Sidecar proxies are typically implemented using dedicated proxy technologies such as Envoy or Linkerd’s purpose-built linkerd2-proxy. They are responsible for load balancing, traffic routing, circuit breaking, retries, timeouts, and enforcing security policies such as mutual TLS.
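The essential idea of a sidecar, interposing on every call without the application changing, can be shown with a minimal in-process sketch. The `SidecarProxy` class and its telemetry fields are illustrative assumptions; a real sidecar like Envoy does this at the network layer, not as a Python wrapper.

```python
import time


class SidecarProxy:
    """Toy sidecar: wraps calls to an upstream service, adding the
    cross-cutting concerns a real proxy would handle on the wire."""

    def __init__(self, upstream):
        self.upstream = upstream       # the wrapped service callable
        self.request_count = 0         # telemetry the proxy collects
        self.total_latency = 0.0

    def call(self, request):
        self.request_count += 1
        start = time.monotonic()
        try:
            # Forward the request; the caller is unaware of the proxy.
            return self.upstream(request)
        finally:
            self.total_latency += time.monotonic() - start


proxy = SidecarProxy(lambda req: req.upper())
print(proxy.call("hello"))   # prints HELLO
```

The application code on either side stays unchanged; the proxy simply sees every request pass through, which is what makes telemetry and policy enforcement possible without touching business logic.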
A service mesh enables fine-grained traffic management, allowing users to control and manipulate the flow of traffic between microservices. This includes features such as request routing based on criteria like headers, weights, or versions, load balancing algorithms, fault injection for testing purposes, and traffic shifting for canary deployments or blue-green rollouts.
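Weighted traffic shifting, the mechanism behind canary deployments, reduces to sampling a backend according to configured weights. The sketch below is a generic illustration, not any mesh's routing code; the version names and the 90/10 split are arbitrary assumptions.

```python
import random


def weighted_route(versions, weights, rng=random.random):
    """Pick a backend version according to canary weights.

    `weights` are fractions of traffic and should sum to 1.0.
    """
    r = rng()
    cumulative = 0.0
    for version, weight in zip(versions, weights):
        cumulative += weight
        if r < cumulative:
            return version
    return versions[-1]   # guard against floating-point drift


# 90/10 canary split between the stable and candidate versions.
random.seed(42)
counts = {"v1": 0, "v2": 0}
for _ in range(10_000):
    counts[weighted_route(["v1", "v2"], [0.9, 0.1])] += 1
# counts is roughly {"v1": ~9000, "v2": ~1000}
```

In a real mesh the same decision is expressed declaratively (e.g. weight fields in a routing rule) and evaluated by the sidecar on every request, so shifting traffic is a config change rather than a redeploy.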
A service mesh provides built-in observability for the entire microservices ecosystem. It typically includes collecting metrics for performance and resource usage, distributed tracing for end-to-end visibility of request flows, and log aggregation for troubleshooting and analysis. This information can be consumed by monitoring and visualization tools, helping users understand the behavior and health of their microservices.
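Distributed tracing works because every hop records a span tagged with a trace ID shared across the whole request. The sketch below is a simplified model under assumed names (`SPANS`, `traced_call`, the "frontend"/"orders" services); real meshes propagate the trace ID in request headers and export spans to a collector.

```python
import time
import uuid

SPANS = []   # in a real mesh, sidecars export spans to a tracing backend


def traced_call(trace_id, service, fn):
    """Wrap one hop of a request in a span carrying the shared trace id."""
    start = time.monotonic()
    result = fn()
    SPANS.append({
        "trace_id": trace_id,                 # same id across all hops
        "service": service,
        "span_id": uuid.uuid4().hex[:8],      # unique per hop
        "duration_s": time.monotonic() - start,
    })
    return result


# One request flows frontend -> orders; both hops share the trace id.
trace = uuid.uuid4().hex
traced_call(trace, "frontend",
            lambda: traced_call(trace, "orders", lambda: "ok"))
```

Grouping spans by `trace_id` afterwards reconstructs the end-to-end path and per-hop latency of each request, which is what a tracing UI visualizes.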
A service mesh can enhance the security of microservices communication by providing features like mutual TLS for encrypting traffic and ensuring the identity of communicating services. Additionally, it can enforce fine-grained access control policies based on various criteria such as service identity, request attributes, or metadata.
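Fine-grained access control of this kind boils down to evaluating each request against identity-based rules, with a default deny. The rule format and service names below are invented for illustration; real meshes express this in policy resources (e.g. authorization policies) evaluated by the sidecar.

```python
# Each rule: (source service identity, destination service, allowed methods).
POLICIES = [
    ("frontend", "orders",  {"GET", "POST"}),
    ("orders",   "billing", {"POST"}),
]


def is_allowed(source, destination, method):
    """Check one request against mesh-wide access-control rules.

    Anything not explicitly permitted is rejected (default-deny).
    """
    for src, dst, methods in POLICIES:
        if src == source and dst == destination and method in methods:
            return True
    return False
```

Because the sidecar authenticates the caller via mutual TLS before this check runs, the `source` identity is cryptographically verified rather than taken from a spoofable header.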
The service mesh architecture helps increase the overall resiliency of microservices-based applications by providing features like circuit breaking, retries, and timeouts, which mitigate the impact of failures, delays, and network issues. These features help improve the fault tolerance of the system and ensure the availability of critical services.
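Circuit breaking is the least obvious of these resiliency features, so here is a minimal sketch of the pattern. The class, thresholds, and injectable clock are assumptions for illustration; production proxies implement this per-upstream with richer state tracking.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: opens after `threshold` consecutive
    failures and fails fast for `reset_after` seconds, then allows a
    single trial call (half-open)."""

    def __init__(self, threshold=3, reset_after=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None   # None means the circuit is closed

    def call(self, fn, *args):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            # Reset window elapsed: go half-open and allow a trial call.
            self.opened_at = None
            self.failures = 0
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()   # trip the breaker
            raise
        self.failures = 0   # any success closes the circuit fully
        return result
```

Failing fast while the circuit is open is the point: callers stop piling requests onto an unhealthy service, which prevents cascading failures across the mesh.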
In a service mesh architecture, the data plane and control plane work together to provide a comprehensive solution for managing, controlling, and observing microservices communication. This lets developers focus on their application’s business logic while the service mesh handles the complexities of inter-service communication.
Developer velocity
By abstracting operational concerns like networking, security, and resiliency from the application code, service mesh enables developers to focus on implementing business logic and features. This separation of concerns accelerates development cycles and reduces the complexity of microservices codebases.
Consistent policy enforcement
A service mesh provides a unified and consistent way to enforce policies and configurations across all microservices in a distributed system. This simplifies the management of large-scale applications and ensures that all services adhere to organizational standards and best practices.
Vendor-neutrality
Service mesh implementations are often platform-agnostic and can be used across various cloud providers, container orchestration systems, or on-premises environments. This flexibility enables organizations to adopt a consistent networking and management solution across their entire infrastructure.
Added complexity
Implementing a service mesh introduces additional components and complexity to the system, such as sidecar proxies, control plane components, and the need to manage configurations and policies. This might increase the learning curve for teams and require new skills or expertise to manage the service mesh effectively.
Performance overhead
The use of sidecar proxies for intercepting and managing traffic between microservices can introduce latency and additional resource consumption. While modern service mesh implementations are designed to minimize this overhead, organizations should carefully evaluate the impact of the service mesh on their application’s performance and resource utilization.
Maturity and adoption
Although service mesh technologies have been gaining traction in recent years, they are still relatively new compared to traditional networking and management solutions. Some organizations might encounter challenges related to the maturity of certain features, community support, or integration with existing tooling and processes.
A service mesh adds operational complexity and introduces an additional control plane for teams to manage. Platform owners, DevOps teams, and SREs have limited resources, which makes adopting a service mesh a significant undertaking given the configuration and ongoing operational effort it requires.
Calico enables a single-pane-of-glass unified control to address the three most popular service mesh use cases—security, observability, and control—with an operationally simpler approach, while avoiding the complexities associated with deploying a separate, standalone service mesh. With Calico, you can easily achieve full-stack observability and security, deploy highly performant encryption, and tightly integrate with existing security infrastructure like firewalls.