[K8S] Job Controller

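What is a Job?
By default, Kubernetes keeps Pods in the Running state.
A Pod that performs batch processing, however, terminates once its work is done.
A Job is the controller suited to batch processing: it ensures that its Pods run to successful completion.
- On abnormal termination: run again
- On normal termination: done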


[master ~]$ kubectl run testpod --image=centos:7 --command sleep 5
pod/testpod created
[master ~]$ kubectl get pods --watch
NAME      READY   STATUS              RESTARTS   AGE
testpod   0/1     ContainerCreating   0          5s
testpod   1/1     Running             0          9s
testpod   0/1     Completed           0          14s
testpod   1/1     Running             1 (1s ago)   15s
testpod   0/1     Completed           1 (6s ago)   20s
testpod   0/1     CrashLoopBackOff    1 (13s ago)   32s
testpod   1/1     Running             2 (14s ago)   33s
testpod   0/1     Completed           2 (19s ago)   38s
testpod   0/1     CrashLoopBackOff    2 (13s ago)   50s
# By default, Kubernetes makes sure the Pod keeps running.
# Because it always tries to keep the Pod in the Running state, it restarts the container even after a normal exit.

[master ~]$ kubectl delete pod testpod
pod "testpod" deleted
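The restarts above come from the Pod's restart policy: `kubectl run` creates the Pod with `restartPolicy: Always` unless told otherwise. As a minimal sketch for comparison, the manifest below sets `restartPolicy: Never` on a bare Pod, so the container should stay in `Completed` after `sleep 5` exits. The file name `testpod-never.yaml` is an assumption made for this example.

```shell
# Write a bare Pod manifest with restartPolicy: Never.
# The file name is illustrative, not part of the original walkthrough.
cat > testpod-never.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: testpod
spec:
  restartPolicy: Never   # kubectl run defaults to Always
  containers:
  - name: testpod
    image: centos:7
    command: ["sleep", "5"]
EOF

# With access to a cluster, apply it and watch the status:
#   kubectl apply -f testpod-never.yaml
#   kubectl get pods --watch
```

A bare Pod with `Never` is not retried when it fails, though. Retrying on failure while stopping on success is the part a Job adds.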


# But a task such as a backup only needs to run once, right?
# The Job controller guarantees exactly that: once the Pod exits normally, the work is done.

 

[따배쿠] 6-6. Kubernetes Job Controller (youtube.com)

soyun3963@cloudshell:~$ cat > job-exam.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: centos-job
spec:
#  completions: 5
#  parallelism: 2
#  activeDeadlineSeconds: 5
  template:
    spec:
      containers:
      - name: centos-container
        image: centos:7
        command: ["bash"]
        args:
        - "-c"
        - "echo 'Hello World'; sleep 25; echo 'Bye'"
      restartPolicy: Never
#      restartPolicy: OnFailure
#  backoffLimit: 3



soyun3963@cloudshell:~/job-controller$ kubectl create -f job-exam.yaml 
job.batch/centos-job created

# Normal termination
Every 2.0s: kubectl get pods -o wide                                                                                                           cs-997982872602-default: Wed Feb 28 01:21:15 2024

NAME               READY   STATUS      RESTARTS   AGE   IP         NODE                                       NOMINATED NODE   READINESS GATES
centos-job-lwxvl   0/1     Completed   0          70s   10.4.2.6   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>


# Check the Job status
soyun3963@cloudshell:~/job-controller$ kubectl get jobs
NAME         COMPLETIONS   DURATION   AGE
centos-job   1/1           38s        2m41s # DURATION: the Job took 38s to complete

soyun3963@cloudshell:~/job-controller$ kubectl describe job centos-job
Name:             centos-job
Namespace:        default
...
Events:
  Type    Reason            Age    From            Message
  ----    ------            ----   ----            -------
  Normal  SuccessfulCreate  5m28s  job-controller  Created pod: centos-job-lwxvl
  Normal  Completed         4m51s  job-controller  Job completed

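# The completed Job and its Pod are retained so that you can still check which logs the container left and what it ran; if the Pod were deleted, that information would be lost.
# To clean up, delete the Job rather than the Pod: the Pod belongs to the Job controller, and deleting the Job also removes its Pods.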

soyun3963@cloudshell:~/job-controller$ kubectl delete jobs centos-job
job.batch "centos-job" deleted
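Finished Jobs can also be cleaned up automatically instead of by hand. The sketch below adds the `ttlSecondsAfterFinished` field from the `batch/v1` Job spec to the earlier manifest. The Job name `centos-job-ttl`, the file name, and the value of 100 seconds are assumptions chosen for illustration.

```shell
# Write a Job manifest that is deleted automatically after it finishes.
# Job name, file name, and TTL value are illustrative.
cat > job-ttl.yaml <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: centos-job-ttl
spec:
  ttlSecondsAfterFinished: 100   # remove the Job and its Pods 100s after it finishes
  template:
    spec:
      containers:
      - name: centos-container
        image: centos:7
        command: ["bash"]
        args:
        - "-c"
        - "echo 'Hello World'; sleep 25; echo 'Bye'"
      restartPolicy: Never
EOF

# With access to a cluster:
#   kubectl create -f job-ttl.yaml
#   kubectl logs job/centos-job-ttl   # read the logs before the TTL expires
```

Once the TTL expires the logs go away with the Pod, so this fits Jobs whose output is collected elsewhere.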

 

# OnFailure option

# The default value of backoffLimit is 6
soyun3963@cloudshell:~/job-controller$ cat job-exam.yaml 
apiVersion: batch/v1
kind: Job
metadata:
  name: centos-job
spec:
#  completions: 5
#  parallelism: 2
#  activeDeadlineSeconds: 5
  template:
    spec:
      containers:
      - name: centos-container
        image: centos:7
        command: ["bashc"] # deliberately invalid command
        args:
        - "-c"
        - "echo 'Hello World'; sleep 50; echo 'Bye'"
#      restartPolicy: Never
      restartPolicy: OnFailure # restart the container in place when it exits abnormally (here up to 3 retries)
  backoffLimit: 3
  
  
Every 2.0s: kubectl get pods -o wide                                                                                                           cs-997982872602-default: Wed Feb 28 01:36:34 2024

NAME               READY   STATUS              RESTARTS     AGE   IP         NODE                                       NOMINATED NODE   READINESS GATES
centos-job-vj9ms   0/1     RunContainerError   1 (6s ago)   7s    10.4.2.7   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
# The container is restarted up to 3 times (backoffLimit); then the Job is marked as failed and the Pod is deleted.
Every 2.0s: kubectl get pods -o wide                                                                                                           cs-997982872602-default: Wed Feb 28 01:37:18 2024

No resources found in default namespace.

# Never option

soyun3963@cloudshell:~/job-controller$ cat job-exam.yaml 
apiVersion: batch/v1
kind: Job
metadata:
  name: centos-job
spec:
#  completions: 5
#  parallelism: 2
#  activeDeadlineSeconds: 5
  template:
    spec:
      containers:
      - name: centos-container
        image: centos:7
        command: ["bashc"] # deliberately invalid command
        args:
        - "-c"
        - "echo 'Hello World'; sleep 50; echo 'Bye'"
      restartPolicy: Never
#      restartPolicy: OnFailure # restart the container in place when it exits abnormally
# backoffLimit: 3 # left unset, so the default of 6 applies; with Never it limits how many failed Pods are replaced

Every 2.0s: kubectl get pods -o wide                                                                                                           cs-997982872602-default: Wed Feb 28 02:16:12 2024

NAME               READY   STATUS       RESTARTS   AGE   IP          NODE                                       NOMINATED NODE   READINESS GATES
centos-job-c6jfd   0/1     StartError   0          33s   10.4.2.8    gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
centos-job-hb7hf   0/1     StartError   0          2s    10.4.2.10   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
centos-job-q5mwl   0/1     StartError   0          22s   10.4.2.9    gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>

# Instead of restarting the container, the Job controller creates a new Pod for each retry.
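The Pod ages in the output above (33s, 22s, 2s) show the replacement Pods being created further and further apart. That is consistent with the exponential back-off described in the Kubernetes Job documentation: 10s, 20s, 40s and so on, capped at six minutes. The function below is a rough model of that schedule only. `backoff_delay` is a hypothetical helper, and the timing in a real cluster will vary.

```shell
# backoff_delay N: approximate wait in seconds before retry number N (1-based),
# doubling from 10s and capped at 360s (six minutes).
# Hypothetical helper that models the documented schedule; it does not query a cluster.
backoff_delay() {
  attempt=$1
  delay=10
  i=1
  while [ "$i" -lt "$attempt" ]; do
    delay=$((delay * 2))
    if [ "$delay" -gt 360 ]; then
      delay=360
    fi
    i=$((i + 1))
  done
  echo "$delay"
}

# Print the modelled delay for the first six retries (the default backoffLimit).
for n in 1 2 3 4 5 6; do
  echo "retry $n: wait about $(backoff_delay "$n")s"
done
```

Under this model the sixth retry alone waits more than five minutes, which is one reason to lower `backoffLimit` for Jobs that should fail fast.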

 

# completions: the number of Pods that must complete successfully for the Job to finish

soyun3963@cloudshell:~/job-controller$ cat job-exam.yaml 
apiVersion: batch/v1
kind: Job
metadata:
  name: centos-job
spec:
  completions: 3
#  parallelism: 2
#  activeDeadlineSeconds: 5
  template:
    spec:
      containers:
      - name: centos-container
        image: centos:7
        command: ["bash"]
        args:
        - "-c"
        - "echo 'Hello World'; sleep 5; echo 'Bye'"
      restartPolicy: Never
#      restartPolicy: OnFailure # restart the container in place when it exits abnormally
# backoffLimit: 3 # not set here, so the default of 6 applies

Every 2.0s: kubectl get pods -o wide                                                                                                           cs-997982872602-default: Wed Feb 28 02:24:41 2024

NAME               READY   STATUS      RESTARTS   AGE   IP          NODE                                       NOMINATED NODE   READINESS GATES
centos-job-76cvh   1/1     Running     0          3s    10.4.2.16   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
centos-job-gtwdn   0/1     Completed   0          20s   10.4.2.14   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
centos-job-qg4cm   0/1     Completed   0          11s   10.4.2.15   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>

# The Pods run one at a time, sequentially, three times in total.

 

# parallelism: the maximum number of Pods running at the same time

# Want to run the Pods concurrently? Use parallelism.
soyun3963@cloudshell:~/job-controller$ cat job-exam.yaml 
apiVersion: batch/v1
kind: Job
metadata:
  name: centos-job
spec:
  completions: 5 
  parallelism: 2 # run at most 2 Pods in parallel
#  activeDeadlineSeconds: 5
  template:
    spec:
      containers:
      - name: centos-container
        image: centos:7
        command: ["bash"]
        args:
        - "-c"
        - "echo 'Hello World'; sleep 5; echo 'Bye'"
      restartPolicy: Never
#      restartPolicy: OnFailure # restart the container in place when it exits abnormally
# backoffLimit: 3 # not set here, so the default of 6 applies

Every 2.0s: kubectl get pods -o wide                                                                                                           cs-997982872602-default: Wed Feb 28 02:26:31 2024

NAME               READY   STATUS    RESTARTS   AGE   IP          NODE                                       NOMINATED NODE   READINESS GATES
centos-job-dwrlb   1/1     Running   0          3s    10.4.2.17   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
centos-job-pf4gb   1/1     Running   0          3s    10.4.2.18   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>



Every 2.0s: kubectl get pods -o wide                                                                                                           cs-997982872602-default: Wed Feb 28 02:27:39 2024

NAME               READY   STATUS      RESTARTS   AGE   IP          NODE                                       NOMINATED NODE   READINESS GATES
centos-job-dwrlb   0/1     Completed   0          72s   10.4.2.17   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
centos-job-kk6hv   0/1     Completed   0          63s   10.4.2.20   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
centos-job-m65j4   0/1     Completed   0          63s   10.4.2.19   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
centos-job-pf4gb   0/1     Completed   0          72s   10.4.2.18   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
centos-job-zvs25   0/1     Completed   0          54s   10.4.2.21   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
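The ages in the final listing (72s, 63s, 54s) show the five Pods running in three batches: two, two, then one. Assuming every Pod succeeds and takes about the same time, the number of batches is the ceiling of completions divided by parallelism. `job_waves` below is a hypothetical helper that does this arithmetic.

```shell
# job_waves COMPLETIONS PARALLELISM: number of sequential batches of Pods,
# assuming all Pods succeed and take about the same time.
# Hypothetical helper; integer ceiling division.
job_waves() {
  completions=$1
  parallelism=$2
  echo $(( (completions + parallelism - 1) / parallelism ))
}

job_waves 5 2   # prints 3 (batches of 2, 2, 1)
job_waves 3 1   # prints 3 (one Pod at a time)
```

Multiplying the batch count by the per-Pod run time gives a rough lower bound on the Job's DURATION.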

# activeDeadlineSeconds: a time limit for the whole Job; when it is exceeded, running Pods are terminated and the Job is marked as failed

soyun3963@cloudshell:~/job-controller$ cat job-exam.yaml 
apiVersion: batch/v1
kind: Job
metadata:
  name: centos-job
spec:
#  completions: 5 
#  parallelism: 2 # run at most 2 Pods in parallel
  activeDeadlineSeconds: 5 # if the Job has not finished within 5 seconds, terminate its Pods and mark the Job as failed
  template:
    spec:
      containers:
      - name: centos-container
        image: centos:7
        command: ["bash"]
        args:
        - "-c"
        - "echo 'Hello World'; sleep 5; echo 'Bye'"
      restartPolicy: Never
#      restartPolicy: OnFailure # restart the container in place when it exits abnormally
# backoffLimit: 3 # not set here, so the default of 6 applies

Every 2.0s: kubectl get pods -o wide                                                                                                           cs-997982872602-default: Wed Feb 28 02:31:39 2024

NAME               READY   STATUS    RESTARTS   AGE   IP          NODE                                       NOMINATED NODE   READINESS GATES
centos-job-7rvsd   1/1     Running   0          3s    10.4.2.22   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>



Every 2.0s: kubectl get pods -o wide                                                                                                           cs-997982872602-default: Wed Feb 28 02:32:12 2024

NAME               READY   STATUS        RESTARTS   AGE   IP       NODE                                       NOMINATED NODE   READINESS GATES
centos-job-5f7q2   0/1     Terminating   0          7s    <none>   gke-cluster-1-default-pool-697aeef7-ccmj   <none>           <none>
# The Pod is terminated once the 5-second deadline has passed.
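A Job stopped by its deadline ends as failed, not completed. To make that easier to observe, the sketch below uses a workload that clearly outlasts the deadline (`sleep 25` against 5 seconds). The Job name `centos-job-deadline` and the file name are assumptions for this example.

```shell
# Write a Job manifest whose workload runs longer than its deadline.
# Job name and file name are illustrative.
cat > job-deadline.yaml <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: centos-job-deadline
spec:
  activeDeadlineSeconds: 5   # the whole Job may be active for at most 5 seconds
  template:
    spec:
      containers:
      - name: centos-container
        image: centos:7
        command: ["bash"]
        args:
        - "-c"
        - "echo 'Hello World'; sleep 25; echo 'Bye'"
      restartPolicy: Never
EOF

# With access to a cluster, create the Job and inspect why it stopped:
#   kubectl create -f job-deadline.yaml
#   kubectl get job centos-job-deadline -o jsonpath='{.status.conditions[*].reason}'
```

On a cluster the reported reason should be `DeadlineExceeded`, and `kubectl get jobs` should never show the Job reaching 1/1 completions.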

 

 

 

 

 
