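1. Overview
After installing Kubernetes on CentOS 7, I checked the status of the components to verify the installation,
and found that the scheduler and the controller manager were reported as Unhealthy.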
# kubectl get componentstatus
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
controller-manager   Unhealthy   Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0               Healthy     {"health":"true"}
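To pick out only the failing components from output like this, a small filter can help. This is a minimal sketch and not part of the original troubleshooting: the function name unhealthy_components is hypothetical, and it assumes STATUS is the second whitespace-separated column, as in the output above.

```shell
#!/usr/bin/env bash
# Print the names of components whose STATUS column is not "Healthy".
# Reads `kubectl get componentstatus --no-headers` style lines on stdin.
# (unhealthy_components is a hypothetical helper name.)
unhealthy_components() {
  awk 'NF >= 2 && $2 != "Healthy" { print $1 }'
}

# Query the cluster only when kubectl is actually installed.
# The deprecation warning goes to stderr, so it is discarded here.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get componentstatus --no-headers 2>/dev/null | unhealthy_components
fi
```

Fed the three status lines above, the filter would print controller-manager and scheduler, one per line.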
2. Interim check
I checked whether the ports being connected to were listening and whether the pods were running.
# netstat -an | grep 1025
tcp 0 0 127.0.0.1:10257 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:10259 0.0.0.0:* LISTEN
tcp6 0 0 :::10250 :::* LISTEN
tcp6 0 0 :::10256 :::* LISTEN
# kubectl get pods -n kube-system
NAME                               READY   STATUS    RESTARTS   AGE
.... (omitted)
kube-controller-manager-lncjhap2   1/1     Running   0          47s
kube-scheduler-lncjhap2            1/1     Running   0          73s
.... (omitted)
# netstat -atnp | grep -i listen
tcp 0 0 0.0.0.0:30119 0.0.0.0:* LISTEN 5533/kube-proxy
tcp 0 0 0.0.0.0:16903 0.0.0.0:* LISTEN 3583/java
tcp 0 0 127.0.0.1:10248 0.0.0.0:* LISTEN 30950/kubelet
tcp 0 0 127.0.0.1:7625 0.0.0.0:* LISTEN 21382/java
tcp 0 0 127.0.0.1:10249 0.0.0.0:* LISTEN 5533/kube-proxy
tcp 0 0 127.0.0.1:2379 0.0.0.0:* LISTEN 4435/etcd
tcp 0 0 10.81.208.160:2379 0.0.0.0:* LISTEN 4435/etcd
tcp 0 0 10.81.208.160:2380 0.0.0.0:* LISTEN 4435/etcd
tcp 0 0 127.0.0.1:2381 0.0.0.0:* LISTEN 4435/etcd
tcp 0 0 127.0.0.1:10257 0.0.0.0:* LISTEN 31336/kube-controll
tcp 0 0 127.0.0.1:10259 0.0.0.0:* LISTEN 30731/kube-schedule
tcp 0 0 10.81.208.160:179 0.0.0.0:* LISTEN 10162/kube-router
tcp 0 0 0.0.0.0:7700 0.0.0.0:* LISTEN 21382/java
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 972/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 1482/master
tcp 0 0 127.0.0.1:45116 0.0.0.0:* LISTEN 30950/kubelet
tcp 0 0 0.0.0.0:16800 0.0.0.0:* LISTEN 21948/java
tcp 0 0 127.0.0.1:50051 0.0.0.0:* LISTEN 10162/kube-router
tcp 0 0 10.81.208.160:50051 0.0.0.0:* LISTEN 10162/kube-router
tcp 0 0 0.0.0.0:16900 0.0.0.0:* LISTEN 32763/java
tcp 0 0 0.0.0.0:16100 0.0.0.0:* LISTEN 21382/java
tcp 0 0 0.0.0.0:32421 0.0.0.0:* LISTEN 5533/kube-proxy
tcp 0 0 0.0.0.0:16901 0.0.0.0:* LISTEN 8706/java
tcp6 0 0 :::10250 :::* LISTEN 30950/kubelet
tcp6 0 0 :::6443 :::* LISTEN 4653/kube-apiserver
tcp6 0 0 :::7180 :::* LISTEN 25466/httpd
tcp6 0 0 :::7280 :::* LISTEN 26284/httpd
tcp6 0 0 :::10256 :::* LISTEN 5533/kube-proxy
tcp6 0 0 ::1:179 :::* LISTEN 10162/kube-router
tcp6 0 0 :::20244 :::* LISTEN 10162/kube-router
tcp6 0 0 :::22 :::* LISTEN 972/sshd
tcp6 0 0 ::1:25 :::* LISTEN 1482/master
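So the scheduler and the controller manager processes are listening, but only on 10259 and 10257. Nothing is listening on 10251 or 10252, the ports that the health check tried to reach.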
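The comparison between the ports in the error message and the ports that are actually listening can be sketched as a small script. This is only an illustration: port_is_listening is a hypothetical helper name, and it assumes the Linux netstat column layout shown above (local address in column 4, state in column 6).

```shell
#!/usr/bin/env bash
# Succeed if `netstat -an` style output on stdin contains a socket in LISTEN
# state whose local address ends in ":<port>".
# (port_is_listening is a hypothetical helper name.)
port_is_listening() {
  local port="$1"
  awk -v suffix=":${port}" '
    $6 == "LISTEN" && substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
    END { exit(found ? 0 : 1) }
  '
}

# Compare the ports from the componentstatus error (10251, 10252)
# with the ports seen in the netstat output (10257, 10259).
if command -v netstat >/dev/null 2>&1; then
  snapshot="$(netstat -an 2>/dev/null || true)"
  for port in 10251 10252 10257 10259; do
    if printf '%s\n' "$snapshot" | port_is_listening "$port"; then
      echo "port ${port}: listening"
    else
      echo "port ${port}: not listening"
    fi
  done
fi
```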
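3. Searching for answers
While searching, I found a suggestion to try removing "--port=0" from the scheduler and controller manager YAML files. That flag turns off the plain-HTTP ports (10251 and 10252), which are exactly the ports the health check was refused on. Let's try it right away!
Reference URL: https://github.com/kubernetes/kubernetes/issues/93342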
4. Applying what I found to the controller manager and scheduler YAML files
* YAML file path: /etc/kubernetes/manifests
1) Edit kube-controller-manager.yaml: comment out the "- --port=0" entry on line 26
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.81.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    # - --port=0
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    image: k8s.gcr.io/kube-controller-manager:v1.20.6
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10257
        scheme: HTTPS
2) Edit kube-scheduler.yaml: comment out the "- --port=0" entry on line 19
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    #- --port=0
    image: k8s.gcr.io/kube-scheduler:v1.20.6
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
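The same edit can be scripted instead of done by hand. This is a rough sketch under a few assumptions: GNU sed (for -i), a hypothetical helper name comment_out_port_flag, and a backup directory of your own choosing. The backup goes outside /etc/kubernetes/manifests because kubelet scans the files in that directory, and a stray copy there could be read as another manifest.

```shell
#!/usr/bin/env bash
# Comment out an active "- --port=0" entry in a static pod manifest.
# Usage: comment_out_port_flag <manifest> <backup_dir>
# (comment_out_port_flag is a hypothetical helper name.)
# The first run saves the original file into <backup_dir>; later runs keep
# that backup and leave already-commented lines alone.
comment_out_port_flag() {
  local manifest="$1" backup_dir="$2"
  local backup
  backup="${backup_dir}/$(basename "$manifest")"
  mkdir -p "$backup_dir"
  if [ ! -e "$backup" ]; then
    cp -p "$manifest" "$backup"
  fi
  # Keep the indentation (\1) and put "# " in front of the flag entry (\2).
  sed -i 's/^\([[:space:]]*\)\(- --port=0[[:space:]]*\)$/\1# \2/' "$manifest"
}

# Example invocation (run as root; the backup path is only an illustration):
#   comment_out_port_flag /etc/kubernetes/manifests/kube-controller-manager.yaml /root/manifest-backup
#   comment_out_port_flag /etc/kubernetes/manifests/kube-scheduler.yaml /root/manifest-backup
```

Matching only lines that consist of the flag entry keeps the substitution from touching other arguments, and restoring is a matter of copying the backup file back.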
5. Restart kubelet to apply the configuration changes
# systemctl restart kubelet.service
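The control plane pods may take a little while to come back after the restart, so it can help to poll instead of checking once. Below is a minimal sketch of a generic retry helper; the name retry, the attempt count, and the delay are all assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
# (retry is a hypothetical helper name.)
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example invocation (values are illustrative): wait up to about a minute
# for the scheduler's port 10251 to start listening.
#   retry 30 2 sh -c 'netstat -an | grep -q ":10251 .*LISTEN"'
```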
6. Verify the result
# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true"}
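Since the warning says ComponentStatus is deprecated, another way to check health is to query the same /healthz endpoints that the liveness probes in the manifests use (HTTPS on 10257 and 10259). The following is a sketch only: healthz_url is a hypothetical helper name, and it assumes /healthz answers without credentials, as the probe configuration suggests.

```shell
#!/usr/bin/env bash
# Map a component name to the HTTPS healthz URL taken from its livenessProbe.
# (healthz_url is a hypothetical helper name.)
healthz_url() {
  case "$1" in
    scheduler)          echo "https://127.0.0.1:10259/healthz" ;;
    controller-manager) echo "https://127.0.0.1:10257/healthz" ;;
    *) echo "unknown component: $1" >&2; return 1 ;;
  esac
}

if command -v curl >/dev/null 2>&1; then
  for component in scheduler controller-manager; do
    # -k: skip certificate verification; -s: no progress output.
    body="$(curl -ks --max-time 3 "$(healthz_url "$component")" || true)"
    echo "${component}: ${body:-no response}"
  done
fi
```

This check works on the control plane node itself and does not depend on the plain-HTTP ports.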