Introduction
Several different autoscalers can be used with Kubernetes. The explanation is as follows:
Some of the standard tools used for autoscaling workloads on Kubernetes are Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), Cluster Proportional Autoscaler (CPA) and Cluster Autoscaler (CA)
Other autoscaling approaches are:
1. Scheduled Autoscaling
2. Reactive Autoscaling
3. Predictive Autoscaling
Note: An HPA can also be created with the kubectl autoscale command.
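For example, a CPU-based HPA can be created imperatively like this (the Deployment name my-app is illustrative):

```shell
# Create an HPA for the (hypothetical) my-app Deployment:
# scale between 1 and 10 replicas, targeting 50% average CPU utilization
kubectl autoscale deployment my-app --cpu-percent=50 --min=1 --max=10
```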
What Is the Purpose of HPA?
The explanation is as follows. This way there is no need to hard-code a fixed number in the "replicas" field of the YAML file; instead you specify the desired minimum and maximum number of Pods.
A HorizontalPodAutoscaler can be used to increase and decrease the number of Pods for your application based on changes in average resource utilization of your Pods. That’s really useful!
For example, an HPA can create more Pods when CPU utilization exceeds your configured threshold. When utilization drops such that fewer Pods would be able to operate at less than the configured threshold, the HPA will remove Pods. This threshold can be configured as an absolute value, but also as a percentage.
What Is the Name of the Component That Does the HPA Work?
The explanation is as follows. The HPA work is done by the HPA Controller.
HPA is a component of Kubernetes that can automatically scale the number of pods. The K8s controller that is responsible for auto-scaling is known as the Horizontal Controller. The horizontal scaler scales pods per the following process:
- Fetch the desired metrics from the pods
- Compute the targeted number of replicas by comparing the fetched metric value to the target metric value
- Update the replica count in the scalable resource, e.g. a Deployment
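The replica computation in the second step can be sketched as follows. This is a simplified model of the HPA control-loop formula, desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue); the real controller additionally applies tolerances, readiness checks, and min/max bounds:

```python
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float) -> int:
    """Simplified HPA scaling formula:
    desired = ceil(current * (currentMetricValue / targetMetricValue))."""
    return math.ceil(current_replicas * (current_metric / target_metric))

# 3 pods averaging 80% CPU against a 50% target -> scale up to 5
print(desired_replicas(3, 80, 50))   # 5
# 4 pods averaging 25% against a 50% target -> scale down to 2
print(desired_replicas(4, 25, 50))   # 2
```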
Pictorially it looks like this.
The important concepts in this figure are:
- the HPA Controller inside kube-controller-manager
- the Metrics Server
What Is the Metrics Server?
The Metrics Server collects resource metrics such as CPU and memory from the pods, via each node's kubelet. The explanation is as follows:
The Metrics Server polls the Summary API endpoint of the kubelet to collect the resource usage metrics of the containers running in the pods. The HPA controller polls the Metrics API endpoint of the Kubernetes API server every 15 seconds (by default), which it proxies to the Metrics Server. In addition, the HPA controller continuously watches the HorizontalPodAutoscaler resource, which maintains the autoscaler configurations. Next, the HPA controller updates the number of pods in the deployment (or other configured resource) to match the requirements based on the configurations. Finally, the Deployment controller responds to the change by updating the ReplicaSet, which changes the number of pods.
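Assuming the Metrics Server is installed, the Metrics API endpoint that the HPA controller polls can be queried directly:

```shell
# Node and pod resource metrics, exposed through the Kubernetes API server
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods" | jq .
```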
Another explanation is as follows:
How to get started with Horizontal Pod Autoscaling
First, your Kubernetes cluster needs to have Metrics Server deployed and configured. Metrics Server collects resource metrics from kubelets and exposes them in the Kubernetes apiserver through the Metrics API for use by the Horizontal Pod Autoscaler and Vertical Pod Autoscaler. The Metrics API can also be accessed by kubectl top, making it easier to debug autoscaling pipelines. Metrics Server can be installed either directly from a YAML manifest or via the official Helm chart. If you are running minikube (like I do), then you need to enable the addon.
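The minikube addon mentioned above can be enabled and then verified like this:

```shell
# Enable the bundled Metrics Server addon in minikube
minikube addons enable metrics-server

# Verify that resource metrics are being collected
kubectl top nodes
kubectl top pods
```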
What Are the Metrics?
The explanation is as follows:
The K8s Horizontal Pod Autoscaler is implemented as a control loop that periodically queries the Resource Metrics API for core metrics, through the metrics.k8s.io API, like CPU/memory, and the Custom Metrics API for application-specific metrics (the external.metrics.k8s.io or custom.metrics.k8s.io APIs). The latter are provided by "adapter" API servers offered by metrics solution vendors. There are some known solutions, but none of those implementations are officially part of Kubernetes.
To query the metrics we do the following:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/myapplication/pods/*/myapplication_api_response_time_avg" | jq .
How Can a Metric Be Specified?
It can be specified in two ways:
1. As an absolute value
2. As a percentage
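In the autoscaling/v2 API the two forms look like this (the resource choices are illustrative):

```yaml
metrics:
# Percentage: target 60% of the pods' requested CPU
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 60
# Absolute value: target an average of 200Mi of memory per pod
- type: Resource
  resource:
    name: memory
    target:
      type: AverageValue
      averageValue: 200Mi
```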
Horizontal Autoscaling with the Memory Metric
Example
Suppose we have a Deployment and Service like this
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pressure-api-deployment
spec:
  selector:
    matchLabels:
      app: pressure-api
  replicas: 1
  template:
    metadata:
      labels:
        app: pressure-api
    spec:
      containers:
        - name: pressure-api
          image: ghcr.io/rahulrai-in/dotnet-pressure-api:latest
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 500m
              memory: 500Mi
---
apiVersion: v1
kind: Service
metadata:
  name: pressure-api-service
  labels:
    run: php-apache
spec:
  ports:
    - port: 80
  selector:
    app: pressure-api
We create the HPA as follows
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: pressure-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pressure-api-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 40
To watch this HPA we do the following. Note that the argument is the name field given under metadata.
kubectl get hpa pressure-api-hpa --watch
kubectl get deployment pressure-api-deployment --watch
To turn the HPA off we do the following
kubectl delete hpa/pressure-api-hpa
kubectl scale --replicas=2 deployment/pressure-api-deployment
Example
Suppose we have a Deployment like this. Note that the extensions/v1beta1 API used in this example has since been removed; on current clusters Deployments use apps/v1.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: backend
      annotations:
        prometheus.io/scrape: 'true'
    spec:
      containers:
        - name: backend
          image: spring-boot-hpa
          imagePullPolicy: IfNotPresent
          env:
            - name: ACTIVEMQ_BROKER_URL
              value: "tcp://queue:61616"
            - name: STORE_ENABLED
              value: "false"
            - name: WORKER_ENABLED
              value: "true"
          ports:
            - containerPort: 8080
          livenessProbe:
            initialDelaySeconds: 5
            periodSeconds: 5
            httpGet:
              path: /health
              port: 8080
          resources:
            limits:
              memory: 512Mi
We create the HPA as follows
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: spring-boot-hpa
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: backend
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metricName: messages
        targetAverageValue: 10
The explanation is as follows:
- You're using the messages metric to scale your Pods. Kubernetes will trigger the autoscaling when there are more than ten messages in the queue.
- As a minimum, the deployment should have two Pods. Ten Pods is the upper limit.
Horizontal Autoscaling with the CPU Metric
I moved this topic to the separate "CPU İle Horizontal Auto Scaling" post.
Horizontal Autoscaling with CPU and Memory Metrics
Example
Suppose we have a Deployment like this
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
    app.kubernetes.io/name: echoserver
  name: echoserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/part-of: echoserver-app
      app.kubernetes.io/version: 1.0.0-SNAPSHOT
      app.kubernetes.io/name: echoserver
  template:
    metadata:
      labels:
        app.kubernetes.io/part-of: echoserver-app
        app.kubernetes.io/version: 1.0.0-SNAPSHOT
        app.kubernetes.io/name: echoserver
    spec:
      containers:
        - env:
            - name: KUBERNETES_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          image: k8s.gcr.io/echoserver:1.5
          imagePullPolicy: Always
          name: echoserver
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          resources:
            limits:
              cpu: 10m
              memory: 20Mi
            requests:
              cpu: 5m
              memory: 5Mi
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 2
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
Suppose we have a Service like this
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: echoserver
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
  name: echoserver
spec:
  ports:
    - name: http
      port: 80
      targetPort: 8080
  selector:
    app.kubernetes.io/name: echoserver
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
  type: ClusterIP
For the Ingress we do the following
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  labels:
    app.kubernetes.io/name: echoserver
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
  name: echoserver
spec:
  ingressClassName: nginx
  rules:
    - host: echoserver.localdev.me
      http:
        paths:
          - backend:
              service:
                name: echoserver
                port:
                  number: 80
            path: /
            pathType: Prefix
For the HPA we do the following
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  labels:
    app.kubernetes.io/part-of: echoserver-app
    app.kubernetes.io/version: 1.0.0-SNAPSHOT
    app.kubernetes.io/name: echoserver
  name: echoserver
spec:
  maxReplicas: 4
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: echoserver
  targetCPUUtilizationPercentage: 50
SINGLE METRIC COMPUTATION
For example, CPU can be used.
MULTIPLE METRICS
For example, CPU consumption and queries per second (QPS) can be used together.
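A sketch of such a multi-metric HPA, using the autoscaling/v2 API and assuming a hypothetical my-app Deployment and a queries_per_second metric exposed through a custom metrics adapter:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: multi-metric-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # illustrative name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource          # core metric from the Metrics Server
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Pods              # custom per-pod metric served by an adapter
      pods:
        metric:
          name: queries_per_second
        target:
          type: AverageValue
          averageValue: "100"
```

When multiple metrics are specified, the HPA computes a desired replica count for each metric and uses the largest of them.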
CUSTOM METRICS
An example is here.