Out of the box, metrics-server doesn't work on a kubeadm-managed Kubernetes cluster: deploy the stock manifest and the pod promptly lands in a CrashLoopBackOff state.
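For reference, metrics-server is usually installed from the upstream components.yaml manifest with something like the following (the URL below is the upstream "latest release" manifest and is an assumption here; pin a version appropriate for your cluster):

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml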
[dwai@k8s01 base]$ kubectl get po -n kube-system
NAME                              READY   STATUS             RESTARTS   AGE
coredns-74ff55c5b-dx5pg           1/1     Running            0          87m
coredns-74ff55c5b-tmm6z           1/1     Running            0          87m
etcd-k8s01                        1/1     Running            0          105m
etcd-k8s02                        1/1     Running            0          98m
etcd-k8s03                        1/1     Running            0          92m
haproxy-k8s01                     1/1     Running            388        106d
keepalived-k8s01                  1/1     Running            1          106d
kube-apiserver-k8s01              1/1     Running            0          105m
kube-apiserver-k8s02              1/1     Running            0          98m
kube-apiserver-k8s03              1/1     Running            0          92m
kube-controller-manager-k8s01     1/1     Running            0          105m
kube-controller-manager-k8s02     1/1     Running            0          98m
kube-controller-manager-k8s03     1/1     Running            0          92m
kube-flannel-ds-2547z             1/1     Running            3          106d
kube-flannel-ds-6ql5v             1/1     Running            1          106d
kube-flannel-ds-crzmt             1/1     Running            3          106d
kube-flannel-ds-hgwzb             1/1     Running            1          106d
kube-flannel-ds-knh9t             1/1     Running            1          106d
kube-proxy-49rls                  1/1     Running            0          104m
kube-proxy-758l7                  1/1     Running            0          104m
kube-proxy-kl87z                  1/1     Running            0          104m
kube-proxy-ljd5b                  1/1     Running            0          104m
kube-proxy-xzmbj                  1/1     Running            0          104m
kube-scheduler-k8s01              1/1     Running            0          105m
kube-scheduler-k8s02              1/1     Running            0          98m
kube-scheduler-k8s03              1/1     Running            0          92m
metrics-server-76f8d9fc69-6r5qv   0/1     CrashLoopBackOff   4          2m32s
[dwai@k8s01 base]$ kubectl describe po metrics-server-76f8d9fc69-6r5qv -n kube-system
Name:                 metrics-server-76f8d9fc69-6r5qv
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 k8s04/192.168.7.14
Start Time:           Thu, 25 Feb 2021 12:40:44 -0500
Labels:               k8s-app=metrics-server
                      pod-template-hash=76f8d9fc69
Annotations:          <none>
Status:               Running
IP:                   10.244.3.19
IPs:
  IP:           10.244.3.19
Controlled By:  ReplicaSet/metrics-server-76f8d9fc69
Containers:
  metrics-server:
    Container ID:  docker://70686b73434fd7910929cd64d538d5deb430a6ac3d16b7f4fedfa6daca297d1e
    Image:         k8s.gcr.io/metrics-server/metrics-server:v0.4.2
    Image ID:      docker-pullable://k8s.gcr.io/metrics-server/metrics-server@sha256:dbc33d7d35d2a9cc5ab402005aa7a0d13be6192f3550c7d42cba8d2d5e3a5d62
    Port:          4443/TCP
    Host Port:     0/TCP
    Args:
      --cert-dir=/tmp
      --secure-port=4443
      --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
      --kubelet-use-node-status-port
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 25 Feb 2021 12:42:36 -0500
      Finished:     Thu, 25 Feb 2021 12:43:06 -0500
    Ready:          False
    Restart Count:  4
    Liveness:       http-get https://:https/livez delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get https://:https/readyz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /tmp from tmp-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from metrics-server-token-dqjwd (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  tmp-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  metrics-server-token-dqjwd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  metrics-server-token-dqjwd
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  2m54s                 default-scheduler  Successfully assigned kube-system/metrics-server-76f8d9fc69-6r5qv to k8s04
  Warning  Unhealthy  2m53s                 kubelet            Liveness probe failed: Get "https://10.244.3.19:4443/livez": dial tcp 10.244.3.19:4443: connect: connection refused
  Normal   Pulled     2m3s (x3 over 2m54s)  kubelet            Container image "k8s.gcr.io/metrics-server/metrics-server:v0.4.2" already present on machine
  Normal   Created    2m3s (x3 over 2m54s)  kubelet            Created container metrics-server
  Normal   Started    2m3s (x3 over 2m54s)  kubelet            Started container metrics-server
  Normal   Killing    2m3s (x2 over 2m33s)  kubelet            Container metrics-server failed liveness probe, will be restarted
  Warning  Unhealthy  113s (x6 over 2m43s)  kubelet            Liveness probe failed: HTTP probe failed with statuscode: 500
  Warning  Unhealthy  111s (x7 over 2m51s)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 500
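The probes fail because metrics-server cannot scrape the kubelets: on a stock kubeadm cluster the kubelets serve self-signed certificates that are not issued by the cluster CA, so metrics-server's TLS verification of the node endpoints fails and its own /livez and /readyz checks report unhealthy. The container logs make this visible; they typically contain x509 certificate-verification errors for the node addresses (the pod name below is from this cluster, yours will differ):

kubectl logs -n kube-system metrics-server-76f8d9fc69-6r5qv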
To fix this, we edit the deployment and add a couple of arguments: --kubelet-insecure-tls, so that metrics-server accepts the kubelets' self-signed serving certificates, and a broader --kubelet-preferred-address-types list —
[dwai@k8s01 base]$ kubectl edit deploy -n kube-system metrics-server
The relevant section of the deployment is shown below in a “before/after” state —
BEFORE:
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
AFTER:
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalDNS,InternalIP,ExternalDNS,ExternalIP,Hostname
    - --kubelet-insecure-tls
    - --kubelet-use-node-status-port
[dwai@k8s01 base]$ kubectl edit deploy -n kube-system metrics-server
deployment.apps/metrics-server edited
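The same change can also be applied non-interactively with a JSON patch. A sketch, assuming metrics-server is the first (and only) container in the pod template and that appending --kubelet-insecure-tls is all you need (the address-types change can be made the same way):

kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'

Keep in mind that --kubelet-insecure-tls disables verification of the kubelet serving certificates; it is a reasonable shortcut for a lab cluster, less so for production.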
[dwai@k8s01 base]$ kubectl get deploy -n kube-system metrics-server
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
metrics-server   1/1     1            1           7m24s
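Another quick check is the APIService that metrics-server registers with the aggregation layer; it should show AVAILABLE as True once the pod is healthy:

kubectl get apiservice v1beta1.metrics.k8s.io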
[dwai@k8s01 base]$ kubectl describe po -n kube-system metrics-server-5c45b9d77b-ffc9p
Name:                 metrics-server-5c45b9d77b-ffc9p
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 k8s04/192.168.7.14
Start Time:           Thu, 25 Feb 2021 12:47:56 -0500
Labels:               k8s-app=metrics-server
                      pod-template-hash=5c45b9d77b
Annotations:          <none>
Status:               Running
IP:                   10.244.3.20
IPs:
  IP:           10.244.3.20
Controlled By:  ReplicaSet/metrics-server-5c45b9d77b
Containers:
  metrics-server:
    Container ID:  docker://4b5df197209f49a96b7d7b476a62a1119898bedef4ee5b4142415b53e4bbff3c
    Image:         k8s.gcr.io/metrics-server/metrics-server:v0.4.2
    Image ID:      docker-pullable://k8s.gcr.io/metrics-server/metrics-server@sha256:dbc33d7d35d2a9cc5ab402005aa7a0d13be6192f3550c7d42cba8d2d5e3a5d62
    Port:          4443/TCP
    Host Port:     0/TCP
    Args:
      --cert-dir=/tmp
      --secure-port=4443
      --kubelet-preferred-address-types=InternalDNS,InternalIP,ExternalDNS,ExternalIP,Hostname
      --kubelet-insecure-tls
      --kubelet-use-node-status-port
    State:          Running
      Started:      Thu, 25 Feb 2021 12:47:57 -0500
    Ready:          True
    Restart Count:  0
    Liveness:       http-get https://:https/livez delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get https://:https/readyz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /tmp from tmp-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from metrics-server-token-dqjwd (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  tmp-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  metrics-server-token-dqjwd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  metrics-server-token-dqjwd
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  57s   default-scheduler  Successfully assigned kube-system/metrics-server-5c45b9d77b-ffc9p to k8s04
  Normal  Pulled     57s   kubelet            Container image "k8s.gcr.io/metrics-server/metrics-server:v0.4.2" already present on machine
  Normal  Created    57s   kubelet            Created container metrics-server
  Normal  Started    57s   kubelet            Started container metrics-server
Now we can query the Metrics API and confirm that metrics-server is registered and returning data —
[dwai@k8s01 base]$ kubectl get --raw /apis/metrics.k8s.io/
{"kind":"APIGroup","apiVersion":"v1","name":"metrics.k8s.io","versions":[{"groupVersion":"metrics.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"metrics.k8s.io/v1beta1","version":"v1beta1"}}
Look at the top resource hogs —
[dwai@k8s01 base]$ kubectl top pod -n kube-system
NAME                              CPU(cores)   MEMORY(bytes)
coredns-74ff55c5b-dx5pg           3m           12Mi
coredns-74ff55c5b-tmm6z           3m           12Mi
etcd-k8s01                        31m          80Mi
etcd-k8s02                        35m          83Mi
etcd-k8s03                        50m          84Mi
haproxy-k8s01                     2m           68Mi
keepalived-k8s01                  1m           108Mi
kube-apiserver-k8s01              56m          313Mi
kube-apiserver-k8s02              50m          239Mi
kube-apiserver-k8s03              50m          261Mi
kube-controller-manager-k8s01     16m          76Mi
kube-controller-manager-k8s02     2m           27Mi
kube-controller-manager-k8s03     2m           25Mi
kube-flannel-ds-2547z             2m           14Mi
kube-flannel-ds-6ql5v             2m           28Mi
kube-flannel-ds-crzmt             2m           14Mi
kube-flannel-ds-hgwzb             2m           16Mi
kube-flannel-ds-knh9t             3m           15Mi
kube-proxy-49rls                  1m           18Mi
kube-proxy-758l7                  1m           17Mi
kube-proxy-kl87z                  1m           17Mi
kube-proxy-ljd5b                  1m           16Mi
kube-proxy-xzmbj                  1m           11Mi
kube-scheduler-k8s01              3m           17Mi
kube-scheduler-k8s02              3m           22Mi
kube-scheduler-k8s03              2m           21Mi
metrics-server-5c45b9d77b-ffc9p   3m           18Mi
The same view, sorted by CPU and by memory —
[dwai@k8s01 base]$ kubectl top pod --sort-by cpu -n kube-system
NAME                              CPU(cores)   MEMORY(bytes)
kube-apiserver-k8s01              63m          313Mi
kube-apiserver-k8s02              51m          239Mi
etcd-k8s03                        51m          84Mi
kube-apiserver-k8s03              47m          261Mi
etcd-k8s02                        37m          83Mi
etcd-k8s01                        36m          80Mi
kube-controller-manager-k8s01     20m          76Mi
metrics-server-5c45b9d77b-ffc9p   4m           18Mi
kube-scheduler-k8s02              3m           22Mi
kube-scheduler-k8s01              3m           17Mi
coredns-74ff55c5b-tmm6z           3m           12Mi
kube-flannel-ds-2547z             3m           14Mi
coredns-74ff55c5b-dx5pg           3m           12Mi
haproxy-k8s01                     2m           68Mi
kube-flannel-ds-6ql5v             2m           28Mi
kube-flannel-ds-crzmt             2m           14Mi
kube-flannel-ds-hgwzb             2m           16Mi
kube-flannel-ds-knh9t             2m           15Mi
kube-scheduler-k8s03              2m           21Mi
kube-controller-manager-k8s02     2m           27Mi
kube-controller-manager-k8s03     2m           25Mi
kube-proxy-ljd5b                  1m           16Mi
kube-proxy-xzmbj                  1m           11Mi
keepalived-k8s01                  1m           108Mi
kube-proxy-kl87z                  1m           17Mi
kube-proxy-49rls                  1m           18Mi
kube-proxy-758l7                  1m           17Mi
[dwai@k8s01 base]$ kubectl top pod --sort-by memory -n kube-system
NAME                              CPU(cores)   MEMORY(bytes)
kube-apiserver-k8s01              63m          313Mi
kube-apiserver-k8s03              47m          261Mi
kube-apiserver-k8s02              51m          239Mi
keepalived-k8s01                  1m           108Mi
etcd-k8s03                        51m          84Mi
etcd-k8s02                        37m          83Mi
etcd-k8s01                        36m          80Mi
kube-controller-manager-k8s01     20m          76Mi
haproxy-k8s01                     2m           68Mi
kube-flannel-ds-6ql5v             2m           28Mi
kube-controller-manager-k8s02     2m           27Mi
kube-controller-manager-k8s03     2m           25Mi
kube-scheduler-k8s02              3m           22Mi
kube-scheduler-k8s03              2m           21Mi
metrics-server-5c45b9d77b-ffc9p   4m           18Mi
kube-proxy-49rls                  1m           18Mi
kube-scheduler-k8s01              3m           17Mi
kube-proxy-758l7                  1m           17Mi
kube-proxy-kl87z                  1m           17Mi
kube-proxy-ljd5b                  1m           16Mi
kube-flannel-ds-hgwzb             2m           16Mi
kube-flannel-ds-knh9t             2m           15Mi
kube-flannel-ds-2547z             3m           14Mi
kube-flannel-ds-crzmt             2m           14Mi
coredns-74ff55c5b-dx5pg           3m           12Mi
coredns-74ff55c5b-tmm6z           3m           12Mi
kube-proxy-xzmbj                  1m           11Mi
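Node-level usage comes from the same API and is worth a look as well; the equivalent commands (--sort-by works here too) are:

kubectl top node
kubectl top node --sort-by memory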