Istio Service Mesh: Ready reckoner

Vijay Chaudhary
6 min read · Dec 21, 2024

With four years of experience implementing service meshes, I am taking the opportunity to highlight the features of the Istio service mesh (SM) from a practitioner’s point of view. Where I have no critical remark on a section, I list it without comment.

https://istio.io/latest/docs/

Traffic Management

  • What’s the problem with K8S based canary deployment?
    - One could create several Deployments with the same label but different versions of the application, and thus build a kind of canary deployment. For example, running four replicas of the production release and one replica of the new release in the same namespace sends roughly 20% of the traffic to the canary. This scales poorly and wastes resources heavily if you want to start at 1%: a single canary replica would then need 99 production replicas next to it.
  • Fault Injection
    - Chaos Engineering
  • Traffic Shifting (Istio v/s Nginx ingress v/s K8S)
    - Istio: a weight on each VirtualService route destination
- destination:
    host: reviews
    subset: v1
  weight: 50
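    - Nginx ingress: canary annotations on the Ingress
# extensions/v1beta1 Ingress was removed in Kubernetes 1.22; networking.k8s.io/v1 is the current API
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "20"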
    - K8S: the replica ratio of two Deployments that share the app label
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-v1 # each Deployment needs its own name
  labels:
    app: nginx
spec:
  replicas: 10 # roughly 10% of the traffic
  selector:
    matchLabels:
      app: nginx
      version: v1
  template:
    metadata:
      labels:
        app: nginx
        version: v1 # must match the selector above
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-v2
  labels:
    app: nginx
spec:
  replicas: 90 # roughly 90% of the traffic
  selector:
    matchLabels:
      app: nginx
      version: v2 # keeps the two Deployments' pod sets apart
  template:
    metadata:
      labels:
        app: nginx
        version: v2
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
  • TCP Traffic Shifting
  • Request Timeouts
  • Circuit Breaking
  • Mirroring
    - A LIVE copy of production traffic goes to the mirrored service; the mirror's responses are discarded
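
The 1% canary that is so expensive with plain replica counts can be sketched in Istio as a route weight, independent of how many pods each version runs. This is a minimal sketch, not a definitive setup: it reuses the reviews host from the snippet above, and it assumes that the pods carry version: v1 and version: v2 labels and that a v2 subset exists (neither is stated in this article).

```yaml
# Assumption: reviews pods are labelled version: v1 / version: v2.
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2          # hypothetical canary subset
    labels:
      version: v2
---
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 99      # production keeps 99% of the requests
    - destination:
        host: reviews
        subset: v2
      weight: 1       # the canary gets 1%, even with a single replica
```

The weights are applied per request by the sidecar proxies, so one canary pod is enough; moving to 5% or 50% is a change to the two weight values only.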

Locality Load Balancing

(For the under-the-hood deployment, please have a look at the story mentioned below.)

- Locality failover
- Locality weighted distribution

  • Ingress
    - Ingress Gateways
    - Secure Gateways
    - Ingress Gateway without TLS Termination
    - Ingress Sidecar TLS Termination (SNI-Kafka/Cassandra — stateful apps)
    - Kubernetes Ingress
    - Kubernetes Gateway API
  • Egress
    - Accessing External Services
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1
kind: ServiceEntry
metadata:
  name: httpbin-ext
spec:
  hosts:
  - httpbin.org
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS
  location: MESH_EXTERNAL
EOF
---
spec:
  meshConfig:
    outboundTrafficPolicy:
      mode: REGISTRY_ONLY

v/s

istioctl install <flags-you-used-to-install-Istio> \
  --set meshConfig.outboundTrafficPolicy.mode=REGISTRY_ONLY
---
$ kubectl exec "$SOURCE_POD" -c sleep -- curl -sS http://httpbin.org/headers
{
  "headers": {
    "Accept": "*/*",
    "Host": "httpbin.org",
    ...
    "X-Envoy-Decorator-Operation": "httpbin.org:80/*",
    ...
  }
}
    - Egress TLS Origination (v/s TLS Termination)
    - Additional security considerations (PCI DSS compliance)

- Note that defining an egress Gateway in Istio does not in itself provide any special treatment for the nodes on which the egress gateway service runs. It is up to the cluster administrator or the cloud provider to deploy the egress gateways on dedicated nodes and to introduce additional security measures to make these nodes more secure than the rest of the mesh.
- Istio cannot securely enforce that all egress traffic actually flows through the egress gateways. Istio only enables such flow through its sidecar proxies. If attackers bypass the sidecar proxy, they could directly access external services without traversing the egress gateway. Thus, the attackers escape Istio’s control and monitoring. The cluster administrator or the cloud provider must ensure that no traffic leaves the mesh bypassing the egress gateway. Mechanisms external to Istio must enforce this requirement. For example, the cluster administrator can configure a firewall to deny all traffic not coming from the egress gateway. The Kubernetes network policies can also forbid all the egress traffic not originating from the egress gateway (see the next section for an example). Additionally, the cluster administrator or the cloud provider can configure the network to ensure application nodes can only access the Internet via a gateway. To do this, the cluster administrator or the cloud provider can prevent the allocation of public IPs to pods other than gateways and can configure NAT devices to drop packets not originating at the egress gateways.

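In practice this becomes three layers: a default policy that denies all outbound traffic, a platform NetworkPolicy (ingress/egress allowed to and from DNS, logging and monitoring agents, etc.), and a PCI NetworkPolicy (all remaining egress ONLY via the egress gateway).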

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
spec:
  podSelector: {}
  policyTypes:
  - Egress # deny all
---
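Define a NetworkPolicy that limits egress traffic from the test-egress namespace to traffic destined for the control plane, the gateway, and the kube-system DNS service (port 53). The selectors only match if the istio-system and kube-system namespaces carry the labels istio=system and kube-system=true.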


cat <<EOF | kubectl apply -n test-egress -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-istio-system-and-kube-dns
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kube-system: "true"
    ports:
    - protocol: UDP
      port: 53 # platform netpol
  - to:
    - namespaceSelector:
        matchLabels:
          istio: system # PCI netpol
EOF
    - Egress Gateways (policy enforcement for PCI DSS compliance, v/s the Policy Enforcement section below; also suits as a NAT-GW)
    - Egress Gateways with TLS Origination
    - Egress using Wildcard Hosts
    - Kubernetes Services for Egress Traffic
    - Using an External HTTPS Proxy

Security

  • Certificate Management (kickoff: the notion of an identity. You give the SM a CA certificate, and Istiod issues each workload a certificate that carries its SPIFFE identity)
    - Plug in CA Certificates
    - Custom CA Integration using Kubernetes CSR * (or Vault as an external CA)
  • Authentication
    - Authentication Policy
kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: "default"
  namespace: "istio-system"
spec:
  mtls:
    mode: STRICT # mTLS: PCI DSS compliance
EOF
    - JWT claim based routing *
    - Copy JWT Claims to HTTP Headers *
    - Mutual TLS Migration
  • Authorization
    - HTTP Traffic
    - TCP Traffic
    - JWT Token
    - External Authorization
    - Explicit Deny
    - Ingress Access Control
    - Trust Domain Migration
    - Dry Run *
  • TLS Configuration
    - Istio Workload Minimum TLS Version Configuration
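
The identity that Certificate Management establishes is what Authorization rules match on once mTLS is STRICT. The following is a minimal sketch under stated assumptions: the namespace foo, the httpbin workload label, the sleep service account and the default cluster.local trust domain are all hypothetical placeholders, not values taken from this article.

```yaml
# Assumptions: namespace "foo", workload label app: httpbin, service account
# "sleep" and trust domain "cluster.local" are illustrative only.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: httpbin-allow-sleep
  namespace: foo
spec:
  selector:
    matchLabels:
      app: httpbin
  action: ALLOW
  rules:
  - from:
    - source:
        # The caller's identity, taken from its mTLS workload certificate:
        # <trust-domain>/ns/<namespace>/sa/<service-account>
        principals: ["cluster.local/ns/foo/sa/sleep"]
    to:
    - operation:
        methods: ["GET"]
```

Because an ALLOW policy now selects the workload, requests that match no rule are denied, so only GET calls from the sleep identity get through.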

Policy Enforcement

  • Enabling Rate Limits using Envoy
operation: INSERT_BEFORE
value:
  name: envoy.filters.http.local_ratelimit
  typed_config:
    "@type": type.googleapis.com/udpa.type.v1.TypedStruct
    type_url: type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
    value:
      stat_prefix: http_local_rate_limiter # throttling like an API-GW
      token_bucket:
        max_tokens: 4
        tokens_per_fill: 4
        fill_interval: 60s
      filter_enabled:
        runtime_ke
---
$ kubectl exec "$(kubectl get pod -l app=ratings -o jsonpath='{.items[0].metadata.name}')" -c ratings -- bash -c 'for i in {1..5}; do curl -s productpage:9080/productpage -o /dev/null -w "%{http_code}\n"; sleep 1; done'
200
200
200
200
429 # yes, this one is now throttled :-)

Observability

  • Telemetry API
  • Metrics
    - Customizing Istio Metrics with Telemetry API
    - Collecting Metrics for TCP Services
    - Customizing Istio Metrics
    - Classifying Metrics Based on Request or Response
    - Querying Metrics from Prometheus
    - Visualizing Metrics with Grafana
  • Logs
    - Configure access logs with Telemetry API
    - Envoy Access Logs (Istio: Powered by Envoy Proxy in the below mentioned story)
    - OpenTelemetry
  • Distributed Tracing
    - Overview
    - Configure tracing with Telemetry API
    - Configure tracing using MeshConfig and pod annotations
    - Configure trace sampling
    - OpenTelemetry
    - Jaeger
    - Zipkin
    - Apache SkyWalking
3 Actors
Difference between Jaeger and Zipkin

Jaeger and Zipkin are both open-source distributed tracing tools that help manage and process data from distributed systems:
    - Jaeger: a good choice for large-scale, complex environments. Jaeger is designed for high scalability and high availability. It has a modern UI with advanced querying capabilities, and integrates well with Kubernetes and the CNCF ecosystem.
    - Zipkin: a good choice for smaller projects or those using Java and Spring Boot. Zipkin is simple to set up and has a straightforward UI. It supports many programming languages and has a mature ecosystem with a wide range of language-specific libraries.
  • Visualizing Your Mesh (I love Kiali)

Multi Cluster

(v/s federation; please have a look at the story mentioned below)

  • Before you begin
  • Install Multi-Primary
  • Install Primary-Remote
  • Install Multi-Primary on different networks
  • Install Primary-Remote on different networks
