Introduction
Different autoscalers can be used with Kubernetes. The explanation is as follows:
Some of the standard tools used for autoscaling workloads on Kubernetes are Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), Cluster Proportional Autoscaler (CPA) and Cluster Autoscaler (CA)
Other autoscaling approaches are:
1. Scheduled Autoscaling (a sketch follows this list)
2. Reactive Autoscaling
3. Predictive Autoscaling
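Kubernetes has no built-in scheduled autoscaler; a common do-it-yourself pattern for scheduled autoscaling is a CronJob that scales a Deployment at fixed times. A minimal sketch, where the Deployment name my-app, the ServiceAccount scaler, and the schedule are all illustrative assumptions:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-up-business-hours
spec:
  schedule: "0 8 * * 1-5"             # weekdays at 08:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: scaler  # assumed to have RBAC permission to scale deployments
          restartPolicy: OnFailure
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command: ["kubectl", "scale", "deployment/my-app", "--replicas=5"]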
Note: An HPA can also be created with the kubectl autoscale command.
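For example, the following creates a CPU-based HPA for the pressure-api-deployment Deployment used later in this post:

kubectl autoscale deployment pressure-api-deployment --cpu-percent=50 --min=1 --max=5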
What Is the Purpose of HPA?
The explanation is as follows. This way there is no need to specify a fixed number in the "replicas" field of the yaml file; instead, the desired minimum and maximum Pod counts are specified.
A HorizontalPodAutoscaler can be used to increase and decrease the number of Pods for your application based on changes in average resource utilization of your Pods. That’s really useful!
For example, an HPA can create more Pods when CPU utilization exceeds your configured threshold. When utilization drops such that fewer Pods would be able to operate at less than the configured threshold, the HPA will remove Pods. This threshold can be configured as an absolute value, but also as a percentage.
Which Component Does the HPA's Work?
The explanation is as follows. The HPA's work is done by the HPA Controller.
HPA is a component of the Kubernetes that can automatically scale the numbers of pods. The K8s controller that is responsible for auto-scaling is known as Horizontal Controller. Horizontal scaler scales pods as per the following process:
- Fetch the desired metrics from the pods
- Compute the targeted number of replicas by comparing the fetched metrics value to the targeted metric value.
- Replica count is updated in the scalable resource eg. Deployment
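The computation in the second step follows the standard formula from the Kubernetes documentation:

desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)

For example, with 2 replicas, a measured CPU usage of 200m per pod and a target of 100m, the HPA scales to ceil(2 * 200 / 100) = 4 replicas.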
The diagram is as follows:
The important concepts in this diagram are:
- The HPA Controller inside kube-controller-manager
- The Metrics Server
What Is the Metrics Server?
The Metrics Server collects the specified metrics from each pod. The metrics can be CPU, memory, or something else. The explanation is as follows:
The Metrics Server polls the Summary API endpoint of the kubelet to collect the resource usage metrics of the containers running in the pods. The HPA controller polls the Metrics API endpoint of the Kubernetes API server every 15 seconds (by default), which it proxies to the Metrics Server. In addition, the HPA controller continuously watches the HorizontalPodAutoscaler resource, which maintains the autoscaler configurations. Next, the HPA controller updates the number of pods in the deployment (or other configured resource) to match the requirements based on the configurations. Finally, the Deployment controller responds to the change by updating the ReplicaSet, which changes the number of pods.
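The 15-second polling interval mentioned above is a kube-controller-manager default; on clusters where the control plane flags are accessible it can be tuned:

kube-controller-manager --horizontal-pod-autoscaler-sync-period=15s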
Another explanation is as follows:
How to get started with Horizontal Pod Autoscaling
First, your Kubernetes cluster needs to have Metrics Server deployed and configured.
Metrics Server collects resource metrics from Kubelets and exposes them in Kubernetes apiserver through Metrics API for use by Horizontal Pod Autoscaler and Vertical Pod Autoscaler. Metrics API can also be accessed by kubectl top, making it easier to debug autoscaling pipelines.
Metrics Server can be installed either directly from YAML manifest or via the official Helm chart.
If you are running minikube (like I do), then you need to enable the addon.
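As commands, the installation options mentioned in the quote look like this (the manifest URL is the one published with metrics-server releases):

# Install directly from the YAML manifest
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Or, on minikube, enable the bundled addon
minikube addons enable metrics-server

# Quick check that the Metrics API is serving data
kubectl top pods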
What Are the Metrics?
The explanation is as follows:
The K8s Horizontal Pod Autoscaler is implemented as a control loop that periodically queries the Resource Metrics API for core metrics, through metrics.k8s.io API, like CPU/memory and the Custom Metrics API for application-specific metrics (external.metrics.k8s.io or custom.metrics.k8s.io API. They are provided by “adapter” API servers offered by metrics solution vendors. There are some known solutions, but none of those implementations are officially part of Kubernetes)
To query the metrics we do the following:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/myapplication/pods/*/myapplication_api_response_time_avg" | jq .
How Can a Metric Be Specified?
It can be specified in two ways (see the sketch after this list):
1. As an absolute value
2. As a percentage
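In the autoscaling/v2 API these two forms correspond to AverageValue and Utilization targets. A minimal sketch of a metrics section showing both; the resource names and numbers are illustrative:

metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue      # absolute value per pod
        averageValue: 200Mi
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization       # percentage of the pod's resource request
        averageUtilization: 50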
Horizontal Auto Scaling with the Memory Metric
Example
Suppose we have a deployment and service like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pressure-api-deployment
spec:
  selector:
    matchLabels:
      app: pressure-api
  replicas: 1
  template:
    metadata:
      labels:
        app: pressure-api
    spec:
      containers:
        - name: pressure-api
          image: ghcr.io/rahulrai-in/dotnet-pressure-api:latest
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 500m
              memory: 500Mi
---
apiVersion: v1
kind: Service
metadata:
  name: pressure-api-service
  labels:
    run: php-apache
spec:
  ports:
    - port: 80
  selector:
    app: pressure-api
For the HPA we do the following:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: pressure-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pressure-api-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 40
To watch this HPA we do the following; the parameter passed is the name field under metadata:
kubectl get hpa pressure-api-hpa --watch
kubectl get deployment pressure-api-deployment --watch
To turn the HPA off, we delete it and then scale the deployment manually:
kubectl delete hpa/pressure-api-hpa
kubectl scale --replicas=2 deployment/pressure-api-deployment
Example
Suppose we have a deployment like this:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: backend
      annotations:
        prometheus.io/scrape: 'true'
    spec:
      containers:
        - name: backend
          image: spring-boot-hpa
          imagePullPolicy: IfNotPresent
          env:
            - name: ACTIVEMQ_BROKER_URL
              value: "tcp://queue:61616"
            - name: STORE_ENABLED
              value: "false"
            - name: WORKER_ENABLED
              value: "true"
          ports:
            - containerPort: 8080
          livenessProbe:
            initialDelaySeconds: 5
            periodSeconds: 5
            httpGet:
              path: /health
              port: 8080
          resources:
            limits:
              memory: 512Mi
And an HPA like this:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: spring-boot-hpa
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: backend
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metricName: messages
        targetAverageValue: 10
The explanation is as follows:
- You’re using the messages metric to scale your Pods. Kubernetes will trigger the autoscaling when there’re more than ten messages in the queue.
- As a minimum, the deployment should have two Pods. Ten Pods is the upper limit.
Horizontal Auto Scaling with CPU
I moved this topic to the Horizontal Auto Scaling with CPU post.
Horizontal Auto Scaling with CPU and Memory Metrics
Example
Suppose we have a deployment like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
    app.kubernetes.io/name: echoserver
  name: echoserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/part-of: echoserver-app
      app.kubernetes.io/version: 1.0.0-SNAPSHOT
      app.kubernetes.io/name: echoserver
  template:
    metadata:
      labels:
        app.kubernetes.io/part-of: echoserver-app
        app.kubernetes.io/version: 1.0.0-SNAPSHOT
        app.kubernetes.io/name: echoserver
    spec:
      containers:
        - env:
            - name: KUBERNETES_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          image: k8s.gcr.io/echoserver:1.5
          imagePullPolicy: Always
          name: echoserver
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          resources:
            limits:
              cpu: 10m
              memory: 20Mi
            requests:
              cpu: 5m
              memory: 5Mi
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 2
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
Suppose we have a service like this:
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: echoserver
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
  name: echoserver
spec:
  ports:
    - name: http
      port: 80
      targetPort: 8080
  selector:
    app.kubernetes.io/name: echoserver
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
  type: ClusterIP
For the Ingress we do the following:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  labels:
    app.kubernetes.io/name: echoserver
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
  name: echoserver
spec:
  ingressClassName: nginx
  rules:
    - host: echoserver.localdev.me
      http:
        paths:
          - backend:
              service:
                name: echoserver
                port:
                  number: 80
            path: /
            pathType: Prefix
status:
  loadBalancer:
    ingress:
      - hostname: localhost
For the HPA we do the following:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  labels:
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
    app.kubernetes.io/name: echoserver
  name: echoserver
spec:
  maxReplicas: 4
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: echoserver
  targetCPUUtilizationPercentage: 50
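To check that the HPA can actually read the CPU metric, the following can be used; if the Metrics Server is not running, the TARGETS column shows <unknown>:

kubectl get hpa echoserver
kubectl describe hpa echoserver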
Single Metric Computation
For example, CPU can be used.
Multiple Metrics
For example, CPU consumption and queries per second (QPS) can be used together, as in the sketch below.
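When multiple metrics are configured, the HPA computes a desired replica count for each metric and uses the largest. A sketch in autoscaling/v2, assuming a hypothetical Deployment named my-app and a custom per-pod metric http_requests_per_second exposed by a metrics adapter:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                      # hypothetical Deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource                  # core metric: CPU utilization
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    - type: Pods                      # custom metric: QPS per pod (assumed adapter)
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"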
Custom Metrics
An example is here.