Kuma service mesh on K3s VPS: a microservice setup for startups

TL;DR

Kuma is a service mesh created by Kong. It is far simpler to operate than Istio and sufficient for a startup running 5-30 microservices.
It runs on K3s, a lightweight Kubernetes distribution; a 3-node cluster of TND Cloud VPS instances is the sweet spot for a startup at the product-market-fit stage.
Core features: automatic mTLS, traffic permissions, retries, circuit breaking, and observability through Prometheus + Jaeger.
The data plane is an Envoy sidecar, the control plane is a single Kuma binary, and installation through kumactl takes one command.
Total setup cost: 3 Cloud VPS 80 instances (799k VND each) = about 2.4 million VND/month, enough to run 30+ services for a B2B startup.

Startups commonly move from a monolith to microservices once the team grows past 10 engineers. The next problem: how does service A call service B securely (mTLS), retry on failure, break the circuit when B is down, and expose latency and errors? That is the job of a service mesh. Istio is the industry standard, but it is complex enough to need a dedicated DevOps team. Kuma is a much lighter alternative that covers roughly 90% of startup use cases.

I have run Kuma in production for 2 fintech startups and 1 marketplace since 2024: 18 microservices in total, peaking at 200 req/s. Initial setup took 1 day and maintenance has been close to zero. This article walks through the full stack: a 3-node K3s cluster on VPS, installing Kuma, deploying 3 demo services, and enabling mTLS and observability.

1. What problem does a service mesh solve?

With many microservices, each one has to handle TLS encryption, retries, timeouts, circuit breaking, metrics, distributed tracing, and rate limiting on its own, and every language (Node, Python, Go, Java) needs its own library for it. A service mesh moves these concerns into a sidecar (an Envoy proxy running next to the app container), so application code no longer has to care.

Without mesh	With Kuma mesh
Each service implements its own retry logic	Declare a short retry policy in YAML
mTLS between services set up by hand	Enable mTLS with 1 policy, certificates rotate automatically
Manual distributed tracing with the Jaeger SDK	Tracing headers injected automatically
Prometheus metrics instrumented by hand	Envoy exposes standard metrics
Rate limiting coded into the app	Rate limit policy in the mesh
Hard to debug when a call from A to B fails	Kuma GUI shows the full request graph

2. Kuma vs Istio vs Linkerd

Criterion	Kuma	Istio	Linkerd
Data plane	Envoy	Envoy	linkerd2-proxy (Rust)
Control plane	Single Go binary	istiod (bundles the former Pilot, Citadel, Galley)	Multiple components
Control plane RAM	~150MB	~1GB	~300MB
Install time	5 minutes	30-60 minutes	15 minutes
Universal mode	Yes (VM + K8s)	Yes, but complex	K8s-focused
Learning curve	Easy	Very steep	Moderate
Multi-zone	Native	Needs extra configuration	Multi-cluster
License	Apache 2.0	Apache 2.0	Apache 2.0

For a startup with 5-30 services, Kuma wins on developer experience. Istio only shows its strengths once you scale past 100 services or need the GCP/IBM ecosystem.

3. Set up a 3-node K3s cluster on VPS

K3s is Rancher's lightweight Kubernetes distribution: a single binary of about 50MB that runs on a VPS with 2GB of RAM. Buy 3 Cloud VPS 80 instances (4GB RAM, 4 cores, 80GB SSD) for the cluster.

# On node1 (control plane)
curl -sfL https://get.k3s.io | sh -s - server \
  --node-name k3s-master \
  --tls-san k8s.your-domain.com \
  --write-kubeconfig-mode 644 \
  --disable traefik

# Get the token that worker nodes use to join
sudo cat /var/lib/rancher/k3s/server/node-token

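On node2 and node3 (workers):

# On each worker node; use k3s-worker-2 as the node name on node3.
# The variables go after the pipe so that the install script (sh) sees them.
curl -sfL https://get.k3s.io | \
  K3S_URL=https://1.2.3.4:6443 \
  K3S_TOKEN=K10...your_token_here \
  sh -s - agent --node-name k3s-worker-1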

Verify the cluster:

kubectl get nodes
# NAME            STATUS   ROLES                  AGE   VERSION
# k3s-master      Ready    control-plane,master   2m    v1.31.4+k3s1
# k3s-worker-1    Ready    <none>                 1m    v1.31.4+k3s1
# k3s-worker-2    Ready    <none>                 1m    v1.31.4+k3s1

4. Install the Kuma control plane on K3s

# Download kumactl
curl -L https://kuma.io/installer.sh | VERSION=2.9.0 sh -
cd kuma-2.9.0/bin
sudo mv * /usr/local/bin/

# Install into the cluster (zone mode is what releases before 2.6 called standalone)
kumactl install control-plane --mode=zone | kubectl apply -f -

# Wait for the control plane to become ready
kubectl wait --for=condition=Ready pod -n kuma-system -l app=kuma-control-plane --timeout=120s

# Open the GUI dashboard
kubectl port-forward -n kuma-system svc/kuma-control-plane 5681:5681 &
# Then browse to http://localhost:5681/gui

The "default" mesh is created automatically. Every namespace labeled kuma.io/sidecar-injection=enabled gets an Envoy sidecar injected into its pods.

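5. Deploy 3 demo services with sidecars

# Label the namespace
kubectl create namespace app
kubectl label namespace app kuma.io/sidecar-injection=enabled

# Deploy the demo services. Only frontend is shown here; api (port 8080)
# and db-proxy (port 5432) follow the same Deployment + Service pattern.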
kubectl apply -n app -f - <<'YAML'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 2
  selector:
    matchLabels: {app: frontend}
  template:
    metadata:
      labels: {app: frontend}
    spec:
      containers:
      - name: web
        image: nginx:alpine
        ports: [{containerPort: 80}]
---
apiVersion: v1
kind: Service
metadata: {name: frontend}
spec:
  selector: {app: frontend}
  ports: [{port: 80, targetPort: 80}]
YAML

Verify that the sidecar was injected:

kubectl -n app get pods
# Each pod shows 2/2 READY (app container + kuma-sidecar)

kumactl inspect dataplanes
# Lists every data plane proxy that is running
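
The 2/2 READY check can also be scripted. The following is a minimal sketch, assuming the injected container is named `kuma-sidecar` (as in the READY note above) and that the input is the parsed output of `kubectl -n app get pods -o json`. The function name `pods_missing_sidecar` and the sample data are hypothetical.

```python
import json

SIDECAR_NAME = "kuma-sidecar"  # assumed container name, taken from the note above


def pods_missing_sidecar(pod_list: dict) -> list[str]:
    """Return names of pods that have no container named SIDECAR_NAME."""
    missing = []
    for pod in pod_list.get("items", []):
        containers = pod.get("spec", {}).get("containers", [])
        names = {c.get("name") for c in containers}
        if SIDECAR_NAME not in names:
            missing.append(pod["metadata"]["name"])
    return missing


# Hand-written sample in the shape of `kubectl get pods -o json` output.
sample = json.loads("""
{"items": [
  {"metadata": {"name": "frontend-abc"},
   "spec": {"containers": [{"name": "web"}, {"name": "kuma-sidecar"}]}},
  {"metadata": {"name": "legacy-xyz"},
   "spec": {"containers": [{"name": "web"}]}}
]}
""")

print(pods_missing_sidecar(sample))  # ['legacy-xyz']
```

The sketch only looks at `spec.containers`, so it would need adjusting if the sidecar were injected in some other way.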

6. Enable mTLS between all services

mTLS is disabled on the "default" mesh out of the box. Enable it with one YAML resource:

kubectl apply -f - <<'YAML'
apiVersion: kuma.io/v1alpha1
kind: Mesh
metadata:
  name: default
spec:
  mtls:
    enabledBackend: ca-1
    backends:
    - name: ca-1
      type: builtin
      dpCert:
        rotation:
          expiration: 24h
      conf:
        caCert:
          RSAbits: 2048
          expiration: 10y
YAML

After applying this, all traffic between the Envoy sidecars is encrypted with mTLS, and the data plane certificates, which expire after 24h, are rotated automatically. To verify, run tcpdump on a node: traffic between pods shows up as TLS instead of plaintext HTTP.

7. Traffic Permission: who may call whom

Once mTLS is on, no service can call another by default (assuming the mesh has no allow-all permission). Each path has to be allowed explicitly with a TrafficPermission policy:

kubectl apply -f - <<'YAML'
apiVersion: kuma.io/v1alpha1
kind: TrafficPermission
metadata:
  name: frontend-to-api
spec:
  sources:
  - match:
      kuma.io/service: frontend_app_svc_80
  destinations:
  - match:
      kuma.io/service: api_app_svc_8080
---
apiVersion: kuma.io/v1alpha1
kind: TrafficPermission
metadata:
  name: api-to-db-proxy
spec:
  sources:
  - match:
      kuma.io/service: api_app_svc_8080
  destinations:
  - match:
      kuma.io/service: db-proxy_app_svc_5432
YAML

This is real zero-trust networking: only the paths you declare work. A newly deployed service cannot call anything until it is granted permission.
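
The deny-by-default model can be pictured as an allow-list of (source, destination) pairs. This is only a sketch of the idea, not how Kuma evaluates policies internally. `service_tag` builds the `name_namespace_svc_port` form seen in the policies above, while `ALLOWED` and `is_allowed` are hypothetical names that mirror the two TrafficPermission resources.

```python
def service_tag(name: str, namespace: str, port: int) -> str:
    """Build a kuma.io/service value in the <name>_<namespace>_svc_<port> form."""
    return f"{name}_{namespace}_svc_{port}"


FRONTEND = service_tag("frontend", "app", 80)
API = service_tag("api", "app", 8080)
DB_PROXY = service_tag("db-proxy", "app", 5432)

# Mirrors the two TrafficPermission resources above: (source, destination).
ALLOWED = {
    (FRONTEND, API),
    (API, DB_PROXY),
}


def is_allowed(source: str, destination: str) -> bool:
    """Deny by default; allow only explicitly declared pairs."""
    return (source, destination) in ALLOWED


print(is_allowed(FRONTEND, API))       # True
print(is_allowed(FRONTEND, DB_PROXY))  # False: no policy grants this path
print(is_allowed(API, FRONTEND))       # False: permissions are directional
```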

8. Retry and Circuit Breaker policies

apiVersion: kuma.io/v1alpha1
kind: MeshRetry
metadata:
  name: retry-api
  namespace: kuma-system
spec:
  # Top-level targetRef selects the calling proxies (here: all of them);
  # to.targetRef selects the destination whose requests are retried.
  targetRef:
    kind: Mesh
  to:
  - targetRef:
      kind: MeshService
      name: api_app_svc_8080
    default:
      http:
        numRetries: 3
        perTryTimeout: 5s
        backOff:
          baseInterval: 100ms
          maxInterval: 1s
        retryOn:
        - "5xx"
        - GatewayError
        - ConnectFailure
---
apiVersion: kuma.io/v1alpha1
kind: MeshCircuitBreaker
metadata:
  name: cb-db-proxy
  namespace: kuma-system
spec:
  targetRef:
    kind: Mesh
  to:
  - targetRef:
      kind: MeshService
      name: db-proxy_app_svc_5432
    default:
      connectionLimits:
        maxConnections: 50
        maxPendingRequests: 100
      outlierDetection:
        interval: 10s
        baseEjectionTime: 30s
        detectors:
          totalFailures:
            consecutive: 5   # eject an endpoint after 5 consecutive failures

Application code does not need to know about retries or circuit breaking. Everything is handled in the Envoy sidecar, and policies can be changed at runtime without redeploying the app.
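
To get a feel for the numbers in the MeshRetry policy, here is a minimal sketch. It assumes Envoy's documented jittered exponential backoff, where the delay before retry N is drawn from the range [0, (2^N - 1) * baseInterval) and capped at maxInterval. The function names are hypothetical, and the worst case ignores any shorter overall request timeout that might apply.

```python
def backoff_upper_bounds_ms(base_ms: int, max_ms: int, num_retries: int) -> list[int]:
    """Upper bound of the jittered delay before each retry, in milliseconds."""
    return [min((2 ** n - 1) * base_ms, max_ms) for n in range(1, num_retries + 1)]


def worst_case_total_ms(per_try_timeout_ms: int, base_ms: int, max_ms: int,
                        num_retries: int) -> int:
    """Worst case: every attempt times out and every backoff hits its bound."""
    attempts = num_retries + 1  # the original request plus the retries
    waits = sum(backoff_upper_bounds_ms(base_ms, max_ms, num_retries))
    return attempts * per_try_timeout_ms + waits


# Values from the policy: baseInterval 100ms, maxInterval 1s, numRetries 3.
bounds = backoff_upper_bounds_ms(base_ms=100, max_ms=1000, num_retries=3)
print(bounds)  # [100, 300, 700]

# perTryTimeout 5s: 4 attempts * 5000ms + 1100ms of backoff.
print(worst_case_total_ms(5000, 100, 1000, 3))  # 21100
```

Under these assumptions a caller could wait about 21 seconds before seeing a failure, which is worth comparing against the caller's own timeout.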

9. Observability: Prometheus, Grafana, Jaeger

Kuma integrates with the usual observability stack out of the box. Enable metrics and tracing on the mesh:

kubectl apply -f - <<'YAML'
apiVersion: kuma.io/v1alpha1
kind: Mesh
metadata:
  name: default
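spec:
  # This apply replaces the whole Mesh spec, so repeat the mtls block
  # from section 6; leaving it out would switch mTLS off again.
  mtls:
    enabledBackend: ca-1
    backends:
    - name: ca-1
      type: builtin
      dpCert:
        rotation:
          expiration: 24h
      conf:
        caCert:
          RSAbits: 2048
          expiration: 10y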
  metrics:
    enabledBackend: prom
    backends:
    - name: prom
      type: prometheus
      conf:
        port: 5670
        path: /metrics
        skipMTLS: true
  tracing:
    defaultBackend: jaeger-collector
    backends:
    - name: jaeger-collector
      type: zipkin
      sampling: 100.0
      conf:
        url: http://jaeger-collector.kuma-system:9411/api/v2/spans
YAML

Install Prometheus + Grafana + Jaeger with Helm:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install kube-prom prometheus-community/kube-prometheus-stack -n monitoring --create-namespace

helm repo add jaegertracing https://jaegertracing.github.io/helm-charts
helm install jaeger jaegertracing/jaeger -n kuma-system

# Import the Kuma dashboard into Grafana
# Grafana ID: 13502 (Kuma Service Mesh)

Once everything is installed, Grafana shows request rate, error rate, and p99 latency for each service in the mesh. The Jaeger UI displays the full trace from frontend to db-proxy.

10. Traffic routing and canary deployment

Deploy version v2 alongside v1 and use MeshHTTPRoute to split traffic 90/10:

apiVersion: kuma.io/v1alpha1
kind: MeshHTTPRoute
metadata:
  name: api-canary
  namespace: kuma-system
spec:
  targetRef:
    kind: Mesh
  to:
  - targetRef:
      kind: MeshService
      name: api_app_svc_8080
    rules:
    - matches:
      - path:
          type: PathPrefix
          value: /
      default:
        backendRefs:
        - kind: MeshServiceSubset
          name: api_app_svc_8080
          tags:
            version: v1
          weight: 90
        - kind: MeshServiceSubset
          name: api_app_svc_8080
          tags:
            version: v2
          weight: 10

Watch v2's error rate in Grafana for a few hours. If it looks fine, raise the v2 weight step by step to 50/50 and then 0/100, and delete the v1 deployment. That is a canary deployment set up in about 5 minutes, without Argo Rollouts or Flagger.
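
What the weights mean can be illustrated with a small simulation. This is a sketch of generic weighted selection, not Envoy's actual load-balancing code. `pick_backend` and `STEPS` are hypothetical names, and the steps are the 90/10, 50/50, 0/100 progression described above.

```python
import random


def pick_backend(weights: dict[str, int], r: float) -> str:
    """Pick a backend for a draw r in [0, 1) using cumulative weights."""
    total = sum(weights.values())
    point = r * total
    cumulative = 0
    for name, weight in weights.items():
        cumulative += weight
        if point < cumulative:
            return name
    raise ValueError("r must be in [0, 1) and the weights must sum to more than 0")


# Promotion steps from the text: 90/10, then 50/50, then 0/100.
STEPS = [{"v1": 90, "v2": 10}, {"v1": 50, "v2": 50}, {"v1": 0, "v2": 100}]

rng = random.Random(42)  # fixed seed so the run is repeatable
for weights in STEPS:
    hits = sum(pick_backend(weights, rng.random()) == "v2" for _ in range(10_000))
    print(weights, f"v2 share: {hits / 10_000:.1%}")
```

Each request is an independent draw, so at 90/10 the v2 share over a short window fluctuates around 10% rather than hitting it exactly. That is one reason to watch the error rate over hours instead of minutes.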

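11. Rate limiting and protecting the API gateway

apiVersion: kuma.io/v1alpha1
kind: MeshRateLimit
metadata:
  name: rate-limit-public-api
  namespace: kuma-system
spec:
  targetRef:
    kind: MeshService
    name: api_app_svc_8080
  # Rate limits apply to inbound traffic, so the section is `from`, not `to`.
  from: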
  - targetRef:
      kind: Mesh
    default:
      local:
        http:
          requestRate:
            num: 100
            interval: 1s
          onRateLimit:
            status: 429
            headers:
              add:
              - name: x-rate-limit
                value: exceeded

This matters for a public API: it limits the damage from request floods and scraping bots. Combined with traffic permissions, the mesh becomes the first layer of protection.
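
Local rate limiting of this kind is commonly modeled as a token bucket. The sketch below assumes that model with the policy's `num: 100` and `interval: 1s`, and a bucket refilled to capacity in whole intervals. The `TokenBucket` class is hypothetical. Because the limit is `local`, each sidecar would keep its own bucket, so 2 replicas of the API would accept up to 200 req/s in total.

```python
class TokenBucket:
    """Token bucket refilled to capacity once per interval."""

    def __init__(self, num: int, interval_s: float, now: float = 0.0) -> None:
        self.capacity = num
        self.interval_s = interval_s
        self.tokens = num
        self.last_fill = now

    def handle(self, now: float) -> int:
        """Return the HTTP status for a request arriving at time `now`."""
        elapsed_intervals = int((now - self.last_fill) // self.interval_s)
        if elapsed_intervals >= 1:
            self.tokens = self.capacity
            self.last_fill += elapsed_intervals * self.interval_s
        if self.tokens > 0:
            self.tokens -= 1
            return 200
        return 429  # same status as onRateLimit in the policy above


bucket = TokenBucket(num=100, interval_s=1.0)
burst = [bucket.handle(now=0.5) for _ in range(101)]
print(burst.count(200), burst.count(429))  # 100 1
print(bucket.handle(now=1.0))              # 200, the bucket was refilled
```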

12. Production tips and common pitfalls

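In my experience the 24h certificate rotation can cause brief pod disruption (Envoy should normally pick up new certificates without a restart): set up a PodDisruptionBudget so that simultaneous rotation does not cause service downtime.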
Sidecar injection makes pods start 2-3s slower: accept the trade-off and set the readinessProbe initialDelaySeconds accordingly.
mTLS between a service and a database outside the mesh does not work: use an ExternalService resource to route around it.
Multi-zone setup for HA: run a global control plane plus a zone control plane per region, using Kuma's multi-zone mode.
Back up the mesh configuration daily: kumactl export > backup.yaml, stored in git for fast rollback.

13. When should you not use a service mesh?

A monolith or an app with only 1-3 services: a mesh is overkill, configuring TLS by hand is enough.
Very low traffic (a few req/s): the sidecars add 20-50% to the cost, which is not worth it.
A team that is new to Kubernetes: learn K8s first, mesh later.
Extremely latency-sensitive workloads (HFT, realtime games): the sidecar adds 1-3ms of latency, which may be unacceptable.

Sweet spot: 5-50 microservices, a team of 5-30 developers, traffic of 10-1000 req/s. That is where Kuma shines.

14. Real-world cost breakdown

Resource	Configuration	Cost/month
3 K3s VPS nodes	Cloud VPS 80 x3 (4GB RAM, 4 cores)	2,397,000 VND
CEPH SSD storage	240GB total	Included
Snapshot backup	Daily	Included
Bandwidth	200Mbps domestic	Included
Kuma + K3s + Prometheus + Jaeger	Open source	0 VND
Total		2,397,000 VND/month

Compared with managed GKE/EKS this is 70-80% cheaper, especially once data egress costs are counted. The trade-off: you maintain K3s upgrades yourself, but K3s is very stable; I upgrade every 6 months without problems.
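
The total is simple arithmetic, sketched here so it can be re-run with other node counts or prices. The per-service figure assumes exactly 30 services, a round number taken from the "30+ services" estimate; the variable names are hypothetical.

```python
NODE_PRICE_VND = 799_000  # Cloud VPS 80, per node per month (from the text)
NODE_COUNT = 3
SERVICES = 30             # assumed round figure, from the "30+ services" estimate

monthly_total = NODE_PRICE_VND * NODE_COUNT
per_service = monthly_total / SERVICES

print(f"{monthly_total:,} VND/month")         # 2,397,000 VND/month
print(f"{per_service:,.0f} VND per service")  # 79,900 VND per service
```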

15. Mesh policy backup and disaster recovery

All mesh policies (TrafficPermission, MeshRetry, MeshHTTPRoute) are YAML, so commit them to a dedicated git repo and deploy them GitOps-style with ArgoCD or Flux. If the cluster dies, restoring from git is all it takes:

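# Export meshes and policies in Kubernetes format so that kubectl can re-apply them
# (the export may include mesh secrets, so keep the repo private)
kumactl export --profile=federation-with-policies --format=kubernetes > /opt/backup/kuma-policy-$(date +%F).yaml

# Commit to a private git repo
cd /opt/gitops-mesh
cp /opt/backup/kuma-policy-*.yaml .
git add . && git commit -m "Backup mesh $(date +%F)" && git push

# Restore when needed
kubectl apply -f kuma-policy-2026-06-13.yaml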

ArgoCD is set up to sync the gitops-mesh repo into the cluster every 5 minutes. Every policy change goes through PR review, which leaves a complete audit trail.

16. Roadmap: upgrading from single zone to multi-zone

When traffic outgrows a single cluster or you need multi-region DR, move to Kuma multi-zone:

Global control plane: 1 main cluster that manages policies.
Zone control plane: each region (Hanoi, Ho Chi Minh City) has 1 cluster running a zone CP.
Cross-zone traffic: encrypted with mTLS automatically, adding 10-30ms of latency across regions.
Failover: when the Hanoi zone goes down, traffic is rerouted automatically to Ho Chi Minh City, provided the service is replicated there.

For startups in Vietnam I recommend 2 zones, 1 in Hanoi and 1 in Ho Chi Minh City, on TND Cloud VPS, which has datacenters in both cities. Cross-zone latency is around 15-20ms, which is fine for most B2B apps. For the multi-zone setup, follow the official Kuma docs; it takes 1-2 days if you already know single zone.

17. Summary and next steps

Kuma on a K3s VPS cluster is a sweet spot for Vietnamese startups growing from a monolith to microservices. The trade-off is clear: about 20% extra RAM/CPU for the sidecars, in exchange for mTLS, retries, observability, and traffic management without writing code. After 1 year in production, I am confident recommending this stack to teams of 5-30 developers with a hosting budget under 5 million VND/month.

Suggested next steps: set up GitOps CI/CD with ArgoCD, a full observability stack with Prometheus + Grafana + Loki + Jaeger, and the Trivy security scanner for images. Combined with Kuma, this makes a solid internal engineering platform for a growing startup.

FAQ

How does Kuma differ from Kong API Gateway?

Kong is an API gateway (north-south traffic, client to service). Kuma is a service mesh (east-west traffic, service to service). Both come from Kong Inc., and they can be integrated through the enterprise Kong Mesh product. A self-hosting startup can use both: Kong as the public gateway and Kuma as the internal mesh.

How does K3s differ from vanilla Kubernetes?

K3s is Rancher's lightweight Kubernetes: it drops legacy code (cloud providers, alpha APIs), replaces etcd with SQLite for single-node setups, and ships as a single binary of about 50MB. It is fully compatible with the Kubernetes API, so kubectl, Helm, and other tools work as usual. The downside is that some cloud-specific CRDs are not supported, but for on-prem VPS K3s is a better fit than vanilla.

How much RAM does the Envoy sidecar use?

Each Envoy sidecar uses about 50-80MB of RAM when idle, so 20 pods add up to 1-1.6GB for sidecars. CPU overhead is 10-20% per request. A VPS with 4GB of RAM can run about 30-40 pods with sidecars. To scale further, use nodes with 8GB+ of RAM.
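
The sidecar estimate can be reproduced for other pod counts with a few lines. This is a sketch that uses only the 50-80MB idle range from the answer above; the function name is hypothetical, and memory under load is not modeled.

```python
SIDECAR_IDLE_MB = (50, 80)  # idle RAM range per Envoy sidecar, from the answer above


def sidecar_ram_mb(pods: int) -> tuple[int, int]:
    """Low and high estimate of total sidecar RAM, in MB."""
    low, high = SIDECAR_IDLE_MB
    return pods * low, pods * high


for pods in (20, 30, 40):
    low, high = sidecar_ram_mb(pods)
    print(f"{pods} pods: {low}-{high} MB")
# 20 pods: 1000-1600 MB
# 30 pods: 1500-2400 MB
# 40 pods: 2000-3200 MB
```

By this estimate, 40 pods on a single 4GB node would leave less than 1GB for the applications and K3s itself at the high end, so the 30-40 pod figure sits at the optimistic end of the range.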

Should you use Kuma for a monolith?

There is no need. A service mesh solves the complexity of microservices. A monolith is a single service, so a TLS certificate on the reverse proxy is enough and a mesh is overkill. Consider a mesh once the monolith has been split into 5-10 services.

Does Kuma support Universal mode for VMs?

Yes, and it is one of Kuma's strong points compared with Linkerd. You can mix services on K8s and services on traditional VPS in the same mesh. That is useful when migrating from a monolith on VMs to microservices on K8s, with both running side by side during the transition.