Wednesday, February 2, 2022

Kubernetes kind: HorizontalPodAutoscaler - Adds/Removes Pods

Introduction
Several different autoscalers can be used with Kubernetes. As described:
Some of the standard tools used for autoscaling workloads on Kubernetes are Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), Cluster Proportional Autoscaler (CPA) and Cluster Autoscaler (CA)
Other autoscaling approaches are:
1. Scheduled Autoscaling
2. Reactive Autoscaling
3. Predictive Auto Scaling

Note: An HPA can also be created with the kubectl autoscale command.
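A sketch of the imperative form; the deployment name pressure-api-deployment matches the example used later in this post, and the thresholds are illustrative:

```shell
# Create an HPA for an existing Deployment without writing a manifest.
# Thresholds are illustrative.
kubectl autoscale deployment pressure-api-deployment --cpu-percent=50 --min=1 --max=5

# Inspect what was created
kubectl get hpa
```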

What Is the Purpose of HPA?
With an HPA there is no need to hard-code a fixed number in the "replicas" field of the YAML file; instead, you specify the desired minimum and maximum number of Pods. As described:
A HorizontalPodAutoscaler can be used to increase and decrease the number of Pods for your application based on changes in average resource utilization of your Pods. That’s really useful!

For example, an HPA can create more Pods when CPU utilization exceeds your configured threshold. When utilization drops such that fewer Pods would be able to operate at less than the configured threshold, the HPA will remove Pods. This threshold can be configured as an absolute value, but also as a percentage.
Which Component Does the HPA's Work?
The work is done by the HPA controller. As described:
HPA is a component of the Kubernetes that can automatically scale the numbers of pods. The K8s controller that is responsible for auto-scaling is known as Horizontal Controller.

Horizontal scaler scales pods as per the following process:
- Fetch the desired metrics from the pods
- Compute the targeted number of replicas by comparing the fetched metrics value to the targeted metric value.
- Replica count is updated in the scalable resource eg. Deployment
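The computation in the second step boils down to the formula documented for the HPA controller, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). A small Python sketch of that formula; the 10% tolerance band matches the controller's default:

```python
import math

def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, tolerance: float = 0.1) -> int:
    """Sketch of the HPA scaling formula:
    desiredReplicas = ceil(currentReplicas * currentValue / targetValue).
    If the ratio is within the tolerance band, the replica count is left alone."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    return math.ceil(current_replicas * ratio)

# CPU at 200m against a 100m target doubles the replica count
print(desired_replicas(2, 200, 100))  # → 4
```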
Diagrammatically, the flow looks like this:

The important concepts in this diagram are:
- kube-controller-manager içindeki HPA Controller
- metrics server

What Is the Metrics Server?
The Metrics Server collects the specified metrics from every Pod; the metrics can be CPU, memory, or something else. As described:
The Metrics Server polls the Summary API endpoint of the kubelet to collect the resource usage metrics of the containers running in the pods. The HPA controller polls the Metrics API endpoint of the Kubernetes API server every 15 seconds (by default), which it proxies to the Metrics Server. In addition, the HPA controller continuously watches the HorizontalPodAutoscaler resource, which maintains the autoscaler configurations. Next, the HPA controller updates the number of pods in the deployment (or other configured resource) to match the requirements based on the configurations. Finally, the Deployment controller responds to the change by updating the ReplicaSet, which changes the number of pods.
Another description:
How to get started with Horizontal Pod Autoscaling
First, your Kubernetes cluster needs to have Metrics Server deployed and configured.

Metrics Server collects resource metrics from Kubelets and exposes them in Kubernetes apiserver through Metrics API for use by Horizontal Pod Autoscaler and Vertical Pod Autoscaler. Metrics API can also be accessed by kubectl top, making it easier to debug autoscaling pipelines.

Metrics Server can be installed either directly from YAML manifest or via the official Helm chart.

If you are running minikube (like I do), then you need to enable the addon.
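On minikube, the bundled addon can be enabled directly (it may take a minute before metrics start flowing):

```shell
# Enable the bundled metrics-server addon on minikube
minikube addons enable metrics-server

# Verify that metrics are being collected
kubectl top pods
```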

What Are the Metrics?
As described:
The K8s Horizontal Pod Autoscaler is implemented as a control loop that periodically queries the Resource Metrics API for core metrics, through metrics.k8s.io API, like CPU/memory and the Custom Metrics API for application-specific metrics (external.metrics.k8s.io or custom.metrics.k8s.io API. They are provided by “adapter” API servers offered by metrics solution vendors. There are some known solutions, but none of those implementations are officially part of Kubernetes)
To query the metrics we run:
kubectl get \
  --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/myapplication/pods/*/myapplication_api_response_time_avg" \
  | jq .
How Can the Metric Be Specified?
It can be specified in two ways:
1. As an absolute value
2. As a percentage
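In autoscaling/v2 syntax these two styles correspond to the target types AverageValue (an absolute amount per Pod) and Utilization (a percentage of the Pod's resource request). A sketch with illustrative values:

```yaml
metrics:
  # 1. Absolute value: target a fixed average amount per Pod
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 200Mi
  # 2. Percentage: target a percentage of the requested resource
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
```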

Horizontal Auto Scaling with a Memory Metric
Example
Suppose we have a Deployment and Service like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pressure-api-deployment
spec:
  selector:
    matchLabels:
      app: pressure-api
  replicas: 1
  template:
    metadata:
      labels:
        app: pressure-api
    spec:
      containers:
        - name: pressure-api
          image: ghcr.io/rahulrai-in/dotnet-pressure-api:latest
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 500m
              memory: 500Mi
---
apiVersion: v1
kind: Service
metadata:
  name: pressure-api-service
  labels:
    app: pressure-api
spec:
  ports:
    - port: 80
  selector:
    app: pressure-api
We create the HPA as follows:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pressure-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pressure-api-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 40
To watch this HPA we run the following (the argument is the name field under metadata):
kubectl get hpa pressure-api-hpa --watch
kubectl get deployment pressure-api-deployment --watch
To delete the HPA, and then set the replica count manually, we run:
kubectl delete hpa/pressure-api-hpa

kubectl scale --replicas=2 deployment/pressure-api-deployment
Example
Suppose we have a Deployment like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
      annotations:
        prometheus.io/scrape: 'true'
    spec:
      containers:
      - name: backend
        image: spring-boot-hpa
        imagePullPolicy: IfNotPresent
        env:
        - name: ACTIVEMQ_BROKER_URL
          value: "tcp://queue:61616"
        - name: STORE_ENABLED
          value: "false"
        - name: WORKER_ENABLED
          value: "true"
        ports:
        - containerPort: 8080
        livenessProbe:
          initialDelaySeconds: 5
          periodSeconds: 5
          httpGet:
            path: /health
            port: 8080
        resources:
          limits:
            memory: 512Mi
And a Service like this:
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  ports:
  - nodePort: 31000
    port: 80
    targetPort: 8080
  selector:
    app: backend
  type: NodePort
We create the HPA as follows:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: spring-boot-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: messages
      target:
        type: AverageValue
        averageValue: "10"
As described:
- You're using the messages metric to scale your Pods. Kubernetes will trigger the autoscaling when there are more than ten messages per Pod in the queue.
- As a minimum, the deployment has one Pod; ten Pods is the upper limit.
Horizontal Auto Scaling with CPU
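A minimal sketch of a CPU-based HPA; the Deployment name my-deployment and the 50% threshold are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cpu-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment   # illustrative name
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50  # scale out when average CPU exceeds 50% of requests
```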

Horizontal Auto Scaling with CPU and Memory Metrics
Example
Suppose we have a Deployment like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
    app.kubernetes.io/name: echoserver
  name: echoserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/part-of: echoserver-app
      app.kubernetes.io/version: 1.0.0-SNAPSHOT
      app.kubernetes.io/name: echoserver
  template:
    metadata:
      labels:
        app.kubernetes.io/part-of: echoserver-app
        app.kubernetes.io/version: 1.0.0-SNAPSHOT
        app.kubernetes.io/name: echoserver
    spec:
      containers:
        - env:
            - name: KUBERNETES_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          image: k8s.gcr.io/echoserver:1.5
          imagePullPolicy: Always
          name: echoserver
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          resources:
            limits:
              cpu: 10m
              memory: 20Mi
            requests:
              cpu: 5m
              memory: 5Mi
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 2
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
And a Service like this:
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: echoserver
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
  name: echoserver
spec:
  ports:
    - name: http
      port: 80
      targetPort: 8080
  selector:
    app.kubernetes.io/name: echoserver
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
  type: ClusterIP

For the Ingress we use:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  labels:
    app.kubernetes.io/name: echoserver
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
  name: echoserver
spec:
  ingressClassName: nginx
  rules:
  - host: echoserver.localdev.me
    http:
      paths:
      - backend:
          service:
            name: echoserver
            port:
              number: 80
        path: /
        pathType: Prefix
For the HPA we use:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  labels:
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
    app.kubernetes.io/name: echoserver
  name: echoserver
spec:
  maxReplicas: 4
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: echoserver
  targetCPUUtilizationPercentage: 50
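The autoscaling/v1 API used above only supports CPU. To scale on both CPU and memory, an autoscaling/v2 manifest can list multiple metrics; the HPA then uses the largest replica count that any single metric asks for. A sketch for the same echoserver Deployment, with illustrative thresholds:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: echoserver
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: echoserver
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # illustrative threshold
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 60   # illustrative threshold
```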
SINGLE METRIC COMPUTATION
For example, CPU can be used.

MULTIPLE METRICS
For example, CPU consumption and queries per second (QPS) can be used.

CUSTOM METRICS
An example is here.

