- Check node status
- Check control plane pods and services
$ kubectl get pods -n kube-system
$ service kube-apiserver status
$ service kube-controller-manager status
$ service kube-scheduler status
$ service kubelet status
$ service kube-proxy status
- Check service logs
$ kubectl logs kube-apiserver-master -n kube-system
$ sudo journalctl -u kube-apiserver
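- If the API server itself is down, kubectl may not respond at all; in that case the container runtime and the kubelet can be inspected directly. A minimal sketch for a kubeadm cluster (the container ID is a placeholder):
$ sudo crictl ps -a | grep kube-apiserver
$ sudo crictl logs <container-id>
$ sudo journalctl -u kubelet -f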
[Lab 1] - kube-scheduler
controlplane ~ ➜ kubectl get pods
NAME READY STATUS RESTARTS AGE
app-5646649cc9-74hw9 0/1 Pending 0 2m1s
controlplane ~ ➜ kubectl get nodes
NAME STATUS ROLES AGE VERSION
controlplane Ready control-plane 8m25s v1.29.0
controlplane ~ ➜ kubectl get deployments.apps
NAME READY UP-TO-DATE AVAILABLE AGE
app 0/1 1 0 2m21s
controlplane ~ ➜ kubectl describe pod app-5646649cc9-74hw9
Name: app-5646649cc9-74hw9
Namespace: default
Priority: 0
Service Account: default
Node: <none>
Labels: app=app
pod-template-hash=5646649cc9
Annotations: <none>
Status: Pending
# The pod is Pending because no node has been assigned to it yet
controlplane ~ ➜ k get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-69f9c977-jxldq 1/1 Running 0 13m
coredns-69f9c977-trp5s 1/1 Running 0 13m
etcd-controlplane 1/1 Running 0 13m
kube-apiserver-controlplane 1/1 Running 0 13m
kube-controller-manager-controlplane 1/1 Running 0 13m
kube-proxy-2bnj5 1/1 Running 0 13m
kube-scheduler-controlplane 0/1 CrashLoopBackOff 6 (64s ago) 7m3s # here is the problem
controlplane ~ ➜ kubectl describe pod -n kube-system kube-scheduler-controlplane
Name: kube-scheduler-controlplane
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: controlplane/192.6.220.6
...
Containers:
kube-scheduler:
Container ID: containerd://4663c870f13fb6dbe920ece454ab7c4d35a02a1d1afefc51f902ff2b240cdf7e
Image: registry.k8s.io/kube-scheduler:v1.29.0
Image ID: registry.k8s.io/kube-scheduler@sha256:5df310234e4f9463b15d166778d697830a51c0037ff28a1759daaad2d3cde991
Port: <none>
Host Port: <none>
Command:
kube-schedulerrrr
--authentication-kubeconfig=/etc/kubernetes/scheduler.conf
--authorization-kubeconfig=/etc/kubernetes/scheduler.conf
--bind-address=127.0.0.1
--kubeconfig=/etc/kubernetes/scheduler.conf
--leader-elect=true
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: StartError
Message: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "kube-schedulerrrr": executable file not found in $PATH: unknown
Exit Code: 128
Started: Thu, 01 Jan 1970 00:00:00 +0000
Finished: Sun, 28 Apr 2024 05:02:53 +0000
Ready: False
....
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 6m18s (x5 over 8m) kubelet Container image "registry.k8s.io/kube-scheduler:v1.29.0" already present on machine
Normal Created 6m18s (x5 over 8m) kubelet Created container kube-scheduler
Warning Failed 6m17s (x5 over 8m) kubelet Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "kube-schedulerrrr": executable file not found in $PATH: unknown
Warning BackOff 2m55s (x28 over 7m59s) kubelet Back-off restarting failed container kube-scheduler in pod kube-scheduler-controlplane_kube-system(23ace3c8b1dea5b6d6b30e6bcbb810a7)
# The events suggest a typo in the command, and scrolling up confirms it:
# the Command is set to kube-schedulerrrr.
# This is fixed in the static pod manifest.
controlplane ~ ➜ cat /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-schedulerrrr
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
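# The fix is simply restoring the correct binary name in the command list. Excerpt of the corrected manifest (only the first command entry changes):
  - command:
    - kube-scheduler        # was: kube-schedulerrrr
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
# The kubelet watches /etc/kubernetes/manifests, so saving the file is enough for the static pod to be recreated.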
controlplane ~ ➜ vi /etc/kubernetes/manifests/kube-scheduler.yaml
controlplane ~ ➜ kubectl get pods -n kube-system --watch
NAME READY STATUS RESTARTS AGE
coredns-69f9c977-jxldq 1/1 Running 0 17m
coredns-69f9c977-trp5s 1/1 Running 0 17m
etcd-controlplane 1/1 Running 0 17m
kube-apiserver-controlplane 1/1 Running 0 17m
kube-controller-manager-controlplane 1/1 Running 0 17m
kube-proxy-2bnj5 1/1 Running 0 17m
kube-scheduler-controlplane 0/1 Pending 0 0s
kube-scheduler-controlplane 0/1 ContainerCreating 0 0s
kube-scheduler-controlplane 0/1 Running 0 2s
# Back to normal operation
controlplane ~ ➜ kubectl get pods
NAME READY STATUS RESTARTS AGE
app-5646649cc9-74hw9 1/1 Running 0 11m
[Lab 2] - kube-controller-manager
Scale the deployment app to 2 pods
controlplane ~ ➜ k get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
app 1/1 1 1 13m
controlplane ~ ➜ kubectl scale deployment app --replicas=2
deployment.apps/app scaled
controlplane ~ ➜ k get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
app 1/2 1 1 13m
Even though the deployment was scaled to 2, the number of PODs does not seem to increase. Investigate and fix the issue.
Inspect the component responsible for managing deployments and replicasets.
controlplane ~ ➜ kubectl describe deployments.apps app
Name: app
Namespace: default
CreationTimestamp: Sun, 28 Apr 2024 04:56:56 +0000
Labels: app=app
Annotations: deployment.kubernetes.io/revision: 1
Selector: app=app
Replicas: 2 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=app
Containers:
nginx:
Image: nginx:alpine
Port: <none>
Host Port: <none>
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: app-5646649cc9 (1/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 15m deployment-controller Scaled up replica set app-5646649cc9 to 1
# The ReplicaSet is still at its initial 1 replica instead of 2.
# Scaling the ReplicaSet up is handled by the controller manager.
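# As a quick cross-check (not captured in this session), the ReplicaSet can also be inspected directly; with the controller manager down, its desired replica count would still show 1:
$ kubectl get rs -l app=app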
controlplane ~ ➜ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-69f9c977-jxldq 1/1 Running 0 30m
coredns-69f9c977-trp5s 1/1 Running 0 30m
etcd-controlplane 1/1 Running 0 30m
kube-apiserver-controlplane 1/1 Running 0 30m
kube-controller-manager-controlplane 0/1 CrashLoopBackOff 7 (47s ago) 11m
kube-proxy-2bnj5 1/1 Running 0 30m
kube-scheduler-controlplane 1/1 Running 0 13m
controlplane ~ ➜ kubectl logs kube-controller-manager-controlplane -n kube-system
I0428 05:20:56.670420 1 serving.go:380] Generated self-signed cert in-memory
E0428 05:20:56.670576 1 run.go:74] "command failed" err="stat /etc/kubernetes/controller-manager-XXXX.conf: no such file or directory"
controlplane ~ ➜ ls /etc/kubernetes/controller-manager-XXXX.conf
ls: cannot access '/etc/kubernetes/controller-manager-XXXX.conf': No such file or directory
# Because it is a static pod, the manifest lives here:
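# If unsure, the static pod directory can be confirmed from the kubelet config (default kubeadm path assumed):
$ grep staticPodPath /var/lib/kubelet/config.yaml
# typically: staticPodPath: /etc/kubernetes/manifests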
controlplane ~ ✖ cat /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
controlplane ~ ➜ cat /etc/kubernetes/manifests/kube-controller-manager.yaml | grep XXX
- --kubeconfig=/etc/kubernetes/controller-manager-XXXX.conf
controlplane ~ ➜ ls /etc/kubernetes/controlle
ls: cannot access '/etc/kubernetes/controlle': No such file or directory
controlplane ~ ✖ ls /etc/kubernetes/controller-manager.conf
/etc/kubernetes/controller-manager.conf
# Change the flag to --kubeconfig=/etc/kubernetes/controller-manager.conf.
controlplane ~ ➜ vi /etc/kubernetes/manifests/kube-controller-manager.yaml
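# The same edit could also be applied non-interactively; a sketch (back the manifest up outside the manifests directory so the kubelet does not pick up the copy):
$ sudo cp /etc/kubernetes/manifests/kube-controller-manager.yaml /root/kube-controller-manager.yaml.bak
$ sudo sed -i 's|controller-manager-XXXX.conf|controller-manager.conf|' /etc/kubernetes/manifests/kube-controller-manager.yaml
# the kubelet notices the change and recreates the static pod on its own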
controlplane ~ ➜ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-69f9c977-jxldq 1/1 Running 0 36m
coredns-69f9c977-trp5s 1/1 Running 0 36m
etcd-controlplane 1/1 Running 0 36m
kube-apiserver-controlplane 1/1 Running 0 36m
kube-proxy-2bnj5 1/1 Running 0 36m
kube-scheduler-controlplane 1/1 Running 0 19m
controlplane ~ ➜ kubectl get pods -n kube-system --watch
NAME READY STATUS RESTARTS AGE
coredns-69f9c977-jxldq 1/1 Running 0 36m
coredns-69f9c977-trp5s 1/1 Running 0 36m
etcd-controlplane 1/1 Running 0 37m
kube-apiserver-controlplane 1/1 Running 0 37m
kube-controller-manager-controlplane 0/1 Running 0 5s
kube-proxy-2bnj5 1/1 Running 0 36m
kube-scheduler-controlplane 1/1 Running 0 19m
kube-controller-manager-controlplane 0/1 Running 0 11s
kube-controller-manager-controlplane 1/1 Running 0 11s
# Scale-up now completes normally
controlplane ~ ➜ k get pods
NAME READY STATUS RESTARTS AGE
app-5646649cc9-74hw9 1/1 Running 0 31m
app-5646649cc9-xx2rt 1/1 Running 0 40s
controlplane ~ ➜ k get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
app 2/2 2 2 32m
[Lab 3] - kube-controller-manager (certificate volume)
Something is wrong with scaling again. We just tried scaling the deployment to 3 replicas. But it's not happening.
controlplane ~ ➜ kubectl get pods
NAME READY STATUS RESTARTS AGE
app-5646649cc9-74hw9 1/1 Running 0 36m
app-5646649cc9-xx2rt 1/1 Running 0 5m4s
# Only 2 pods are running
controlplane ~ ➜ kubectl get deployments.apps
NAME READY UP-TO-DATE AVAILABLE AGE
app 2/3 2 2 36m
controlplane ~ ➜ k describe deploy app
Name: app
Namespace: default
CreationTimestamp: Sun, 28 Apr 2024 04:56:56 +0000
Labels: app=app
Annotations: deployment.kubernetes.io/revision: 1
Selector: app=app
Replicas: 3 desired | 2 updated | 2 total | 2 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=app
Containers:
nginx:
Image: nginx:alpine
Port: <none>
Host Port: <none>
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Progressing True NewReplicaSetAvailable
Available True MinimumReplicasAvailable
OldReplicaSets: <none>
NewReplicaSet: app-5646649cc9 (2/2 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 36m deployment-controller Scaled up replica set app-5646649cc9 to 1
Normal ScalingReplicaSet 5m37s deployment-controller Scaled up replica set app-5646649cc9 to 2 from 1
controlplane ~ ➜ k get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-69f9c977-jxldq 1/1 Running 0 43m
coredns-69f9c977-trp5s 1/1 Running 0 43m
etcd-controlplane 1/1 Running 0 43m
kube-apiserver-controlplane 1/1 Running 0 43m
kube-controller-manager-controlplane 0/1 Error 4 (48s ago) 103s # error again
kube-proxy-2bnj5 1/1 Running 0 43m
kube-scheduler-controlplane 1/1 Running 0 26m
controlplane ~ ✖ kubectl logs kube-controller-manager-controlplane -n kube-system
I0428 05:34:30.777020 1 serving.go:380] Generated self-signed cert in-memory
E0428 05:34:31.234509 1 run.go:74] "command failed" err="unable to load client CA provider: open /etc/kubernetes/pki/ca.crt: no such file or directory"
# The file it complains about actually exists on the host:
controlplane ~ ➜ ls /etc/kubernetes/pki/ca.crt
/etc/kubernetes/pki/ca.crt
controlplane ~ ➜ vi /etc/kubernetes/manifests/kube-controller-manager.yaml
...
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/ca-certificates
      type: DirectoryOrCreate
    name: etc-ca-certificates
  - hostPath:
      path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      type: DirectoryOrCreate
    name: flexvolume-dir
  - hostPath:
      path: /etc/kubernetes/pki    # this hostPath was the misconfigured entry; it must point at the directory that actually contains ca.crt
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath:
      path: /etc/kubernetes/controller-manager.conf
      type: FileOrCreate
    name: kubeconfig
  - hostPath:
      path: /usr/local/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-local-share-ca-certificates
  - hostPath:
      path: /usr/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-share-ca-certificates
status: {}
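# A quick way to cross-check that the k8s-certs volume really points at the directory holding ca.crt (a sketch):
$ grep -n -B3 -A1 'name: k8s-certs' /etc/kubernetes/manifests/kube-controller-manager.yaml
$ ls /etc/kubernetes/pki/ca.crt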
controlplane ~ ➜ k get pods -n kube-system --watch
NAME READY STATUS RESTARTS AGE
coredns-69f9c977-jxldq 1/1 Running 0 51m
coredns-69f9c977-trp5s 1/1 Running 0 51m
etcd-controlplane 1/1 Running 0 51m
kube-apiserver-controlplane 1/1 Running 0 51m
kube-controller-manager-controlplane 0/1 Running 0 6s
kube-proxy-2bnj5 1/1 Running 0 51m
kube-scheduler-controlplane 1/1 Running 0 34m
kube-controller-manager-controlplane 0/1 Running 0 11s
kube-controller-manager-controlplane 1/1 Running 0 11s
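# With the controller manager Running again, the deployment should finally reach 3 replicas; confirm with (output not captured in this session):
$ kubectl get deploy app
$ kubectl get pods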