1. Overview
I installed Kubernetes on CentOS 7 and checked the status of its components to verify the installation,
and found that the scheduler and the controller-manager were reported as Unhealthy.
# kubectl get componentstatus
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
controller-manager   Unhealthy   Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0               Healthy     {"health":"true"}
2. Interim check
: I checked whether the ports being connected to were listening and whether the pods were running.
# netstat -an | grep 1025
tcp        0      0 127.0.0.1:10257         0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:10259         0.0.0.0:*               LISTEN
tcp6       0      0 :::10250                :::*                    LISTEN
tcp6       0      0 :::10256                :::*                    LISTEN
# kubectl get pods -n kube-system
NAME                               READY   STATUS    RESTARTS   AGE
.... (omitted)
kube-controller-manager-lncjhap2   1/1     Running   0          47s
kube-scheduler-lncjhap2            1/1     Running   0          73s
.... (omitted)
# netstat -atnp | grep -i listen
tcp        0      0 0.0.0.0:30119           0.0.0.0:*               LISTEN      5533/kube-proxy
tcp        0      0 0.0.0.0:16903           0.0.0.0:*               LISTEN      3583/java
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      30950/kubelet
tcp        0      0 127.0.0.1:7625          0.0.0.0:*               LISTEN      21382/java
tcp        0      0 127.0.0.1:10249         0.0.0.0:*               LISTEN      5533/kube-proxy
tcp        0      0 127.0.0.1:2379          0.0.0.0:*               LISTEN      4435/etcd
tcp        0      0 10.81.208.160:2379      0.0.0.0:*               LISTEN      4435/etcd
tcp        0      0 10.81.208.160:2380      0.0.0.0:*               LISTEN      4435/etcd
tcp        0      0 127.0.0.1:2381          0.0.0.0:*               LISTEN      4435/etcd
tcp        0      0 127.0.0.1:10257         0.0.0.0:*               LISTEN      31336/kube-controll
tcp        0      0 127.0.0.1:10259         0.0.0.0:*               LISTEN      30731/kube-schedule
tcp        0      0 10.81.208.160:179       0.0.0.0:*               LISTEN      10162/kube-router
tcp        0      0 0.0.0.0:7700            0.0.0.0:*               LISTEN      21382/java
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      972/sshd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      1482/master
tcp        0      0 127.0.0.1:45116         0.0.0.0:*               LISTEN      30950/kubelet
tcp        0      0 0.0.0.0:16800           0.0.0.0:*               LISTEN      21948/java
tcp        0      0 127.0.0.1:50051         0.0.0.0:*               LISTEN      10162/kube-router
tcp        0      0 10.81.208.160:50051     0.0.0.0:*               LISTEN      10162/kube-router
tcp        0      0 0.0.0.0:16900           0.0.0.0:*               LISTEN      32763/java
tcp        0      0 0.0.0.0:16100           0.0.0.0:*               LISTEN      21382/java
tcp        0      0 0.0.0.0:32421           0.0.0.0:*               LISTEN      5533/kube-proxy
tcp        0      0 0.0.0.0:16901           0.0.0.0:*               LISTEN      8706/java
tcp6       0      0 :::10250                :::*                    LISTEN      30950/kubelet
tcp6       0      0 :::6443                 :::*                    LISTEN      4653/kube-apiserver
tcp6       0      0 :::7180                 :::*                    LISTEN      25466/httpd
tcp6       0      0 :::7280                 :::*                    LISTEN      26284/httpd
tcp6       0      0 :::10256                :::*                    LISTEN      5533/kube-proxy
tcp6       0      0 ::1:179                 :::*                    LISTEN      10162/kube-router
tcp6       0      0 :::20244                :::*                    LISTEN      10162/kube-router
tcp6       0      0 :::22                   :::*                    LISTEN      972/sshd
tcp6       0      0 ::1:25                  :::*                    LISTEN      1482/master
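For now, the scheduler and the controller-manager do appear to be listening, but only on 127.0.0.1:10259 and 127.0.0.1:10257. Nothing is listening on 10251 or 10252, the ports the health check above tries to reach.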
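Scanning a long netstat listing by eye is error-prone, so the comparison between the ports that are listening and the ports the health check wants can be sketched as a small filter. The helper name `is_listening` is hypothetical (it is not part of any Kubernetes tooling), and it assumes the usual `netstat -an` column layout where the local address is the 4th field and the state is the 6th.

```shell
# Hypothetical helper: reads `netstat -an` style output on stdin and
# succeeds (exit 0) if some LISTEN line has a local address ending in :<port>.
# Assumes field 4 is the local address and field 6 is the socket state.
is_listening() {
  awk -v port="$1" '
    $6 == "LISTEN" {
      n = split($4, parts, ":")      # the port is the last colon-separated piece
      if (parts[n] == port) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Example against a live host (not executed here):
#   for p in 10251 10252 10257 10259; do
#     if netstat -an | is_listening "$p"; then echo "$p listening"; else echo "$p closed"; fi
#   done
```

On the host above this kind of check would be expected to report 10257 and 10259 as listening and 10251 and 10252 as closed, which matches the "connection refused" messages.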
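3. Time to start Googling
While searching, I found a guide suggesting that "--port=0" be removed from the scheduler and controller-manager yaml files. Let's try it right away!
: Reference URL : https://github.com/kubernetes/kubernetes/issues/93342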
4. Apply what I found to the controller-manager and scheduler yaml files
* yaml file path : /etc/kubernetes/manifests
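Before editing, it can help to confirm exactly where the flag sits in each manifest. The sketch below wraps `grep -n` in a hypothetical helper named `find_port_zero`; the two file names are the ones edited in this post, and the default directory is the path given above.

```shell
# Hypothetical helper: print file:line:content for every line that contains
# "--port=0" in the two static pod manifests. Note that it also matches
# lines that are already commented out.
find_port_zero() {
  dir="${1:-/etc/kubernetes/manifests}"
  # -e keeps grep from treating the pattern's leading dashes as an option.
  grep -n -e '--port=0' \
    "$dir/kube-controller-manager.yaml" \
    "$dir/kube-scheduler.yaml"
}

# Example (not executed here):
#   find_port_zero
```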
1) Edit kube-controller-manager.yaml : comment out the "- --port=0" entry on line 26
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.81.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    # - --port=0
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    image: k8s.gcr.io/kube-controller-manager:v1.20.6
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10257
        scheme: HTTPS
2) Edit kube-scheduler.yaml : comment out the "- --port=0" entry on line 19
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    # - --port=0
    image: k8s.gcr.io/kube-scheduler:v1.20.6
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
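The two edits above were done by hand, but the same change can be sketched as a script. The helper name `comment_out_port_zero` and the backup directory argument are assumptions made for illustration. The backup is deliberately written outside the manifests directory, because the kubelet may try to load extra files left in that directory as static pod manifests.

```shell
# Hypothetical helper: comment out the "- --port=0" line in one manifest,
# keeping a copy of the original in a separate backup directory.
#   $1 = path to the manifest, $2 = existing backup directory
comment_out_port_zero() {
  manifest="$1"
  backup_dir="$2"
  cp -p "$manifest" "$backup_dir/" || return 1
  tmp="$(mktemp)" || return 1
  # Keep the original indentation (\1) and prefix the entry with "# ".
  sed 's/^\([[:space:]]*\)- --port=0[[:space:]]*$/\1# - --port=0/' "$manifest" > "$tmp" &&
    cat "$tmp" > "$manifest"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Example (not executed here; /root/manifest-backup is an assumed path):
#   mkdir -p /root/manifest-backup
#   comment_out_port_zero /etc/kubernetes/manifests/kube-scheduler.yaml /root/manifest-backup
```

Running it a second time leaves the file unchanged, since an already commented line no longer matches the pattern.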
5. Restart kubelet to apply the configuration changes
# systemctl restart kubelet.service
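The pods may take a little while to come back after the restart, so checking immediately can still show failures. A generic retry loop is one way to wait; the helper name `wait_for` is hypothetical, and the curl call in the example comment simply reuses the health URL from the error message in section 1.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between tries; succeeds as soon as the command succeeds.
#   wait_for <attempts> <delay> <command> [args...]
wait_for() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (not executed here): wait up to about a minute for the scheduler.
#   wait_for 30 2 curl -sf http://127.0.0.1:10251/healthz
```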
6. Check the result
# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true"}
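To make this final check scriptable, the output can be filtered down to the components that are not Healthy. This is only a sketch: `unhealthy_components` is a hypothetical helper that assumes the column layout shown above (name first, status second) and skips the header and the deprecation warning.

```shell
# Hypothetical helper: read `kubectl get cs` output on stdin and print the
# name of every component whose STATUS column is not "Healthy".
unhealthy_components() {
  awk 'NF && $1 != "NAME" && $1 != "Warning:" && $2 != "Healthy" { print $1 }'
}

# Example (not executed here):
#   kubectl get cs 2>/dev/null | unhealthy_components
```

An empty result means every listed component reported Healthy.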
