Kubernetes Removals, Deprecations, and Major Changes in 1.26

CRI v1alpha2 removal and end of containerd 1.5 support

Removal

flowcontrol.apiserver.k8s.io/v1beta1

API Priority and Fairness

Concepts

Resources

CRI v1alpha2 removal and end of containerd 1.5 support

CRI API

GitHub - kubernetes/cri-api: Container Runtime Interface (CRI) – a plugin interface which enables kubelet to use a wide variety of container runtimes.

https://github.com/kubernetes/cri-api?tab=readme-ov-file#v126

vs. dockershim (removed as of v1.24)

Use containerd 1.6 or later

Removal

flowcontrol.apiserver.k8s.io/v1beta1

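What is flow control? It controls API server behavior under request overload, using priority and fairness.

• Also used when debugging cluster trouble such as overload (see the flow control reference below)
• The notes below are based on 1.29, where the feature is stable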

Flow control

API Priority and Fairness controls the behavior of the Kubernetes API server in an overload situation. You can find more information about it in the API Priority and Fairness documentation. Diagnostics Every HTTP response from an API server with the priority and fairness feature enabled has two extra headers: X-Kubernetes-PF-FlowSchema-UID and X-Kubernetes-PF-PriorityLevel-UID, noting the flow schema that matched the request and the priority level to which it was assigned, respectively.

https://kubernetes.io/docs/reference/debug-cluster/flow-control/

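• With the feature enabled, every API server response carries two extra headers:
  ◦ flow schema: X-Kubernetes-PF-FlowSchema-UID
  ◦ priority level: X-Kubernetes-PF-PriorityLevel-UID
  ◦ The headers carry UIDs only; object names are not exposed.
    ▪ This hides the details from API users who may not be allowed to view those objects. To map UIDs to names: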

kubectl get flowschemas -o custom-columns="uid:{metadata.uid},name:{metadata.name}"
kubectl get prioritylevelconfigurations -o custom-columns="uid:{metadata.uid},name:{metadata.name}"

  ◦ Debug endpoints are available when the APIPriorityAndFairness feature gate is enabled and RBAC grants access to the nonResourceURLs under /debug/api_priority_and_fairness/
    ▪ The feature gate has been on by default since it went beta in 1.20
  ◦ /debug/api_priority_and_fairness/dump_priority_levels
    ▪ IsQuiescing: whether the priority level will be removed once its queues have drained
  ◦ /debug/api_priority_and_fairness/dump_queues
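
The dump_priority_levels endpoint returns comma-separated text, so checking the IsQuiescing column can be scripted. Below is a minimal Python sketch. The sample text and the priority level names in it are made up for illustration, and the real set of columns depends on the Kubernetes version, so the parser reads whatever header it finds instead of assuming one.

```python
import csv
import io

# Hypothetical sample of dump_priority_levels output. Real columns and values
# depend on the Kubernetes version and on the cluster.
SAMPLE_DUMP = """\
PriorityLevelName, ActiveQueues, IsIdle, IsQuiescing, WaitingRequests, ExecutingRequests
workload-low,      0,            true,   false,       0,               0
old-level,         0,            true,   true,        0,               0
leader-election,   0,            true,   false,       0,               0
"""


def parse_dump(text):
    """Parse the comma-separated dump into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [{key.strip(): value.strip() for key, value in row.items()} for row in reader]


def quiescing_levels(rows):
    """Names of priority levels that will be removed once their queues drain."""
    return [row["PriorityLevelName"] for row in rows if row["IsQuiescing"] == "true"]


rows = parse_dump(SAMPLE_DUMP)
print(quiescing_levels(rows))
```

On a real cluster the input text could come from `kubectl get --raw /debug/api_priority_and_fairness/dump_priority_levels`, given the RBAC permission noted above.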

API Priority and Fairness

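• Options that give some, though not sufficient, control when kube-apiserver is overloaded with requests:
  ◦ --max-requests-inflight
  ◦ --max-mutating-requests-inflight
• APF
  ◦ Classifies and isolates requests in a more fine-grained way
  ◦ Adds a limited amount of queueing so that requests are not rejected during very brief bursts
  ◦ Fair queueing: a poorly behaved controller cannot starve others, even at the same priority level
  ◦ Designed to work well with standard clients, i.e. those that use informers and retry failed requests with exponential back-off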

Concepts

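• FlowSchema
  ◦ Classifies incoming requests by their attributes
  ◦ Assigns each request to a priority level
• Priority levels keep separate concurrency limits, so requests at different levels cannot starve each other; this gives a degree of isolation
• Fair-queueing algorithm (within a priority level)
  ◦ Prevents requests from different flows from starving each other
  ◦ Queues requests so that bursty traffic does not cause failures while the average load is acceptably low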

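Priority levels

• With APF enabled, the sum of max-requests-inflight and max-mutating-requests-inflight is divided among a configurable set of priority levels.
• Each request is assigned to exactly one priority level
• Each priority level dispatches only as many concurrent requests as its own limit allows
• e.g. the default configuration has separate levels for the following, so that none of them is starved:
  ◦ leader-election requests
  ◦ requests from built-in controllers
  ◦ requests from Pods
• Concurrency limits are adjusted periodically.
  ◦ Under-utilized priority levels temporarily lend concurrency to heavily utilized ones
  ◦ The limits are based on nominal limits plus bounds on how much a level may lend and borrow, all derived from the objects described below.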

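Seats occupied by a request

• Baseline: each request occupies one seat at any given moment, regardless of how long it takes to process.
• Some requests, such as list, occupy more seats: the server estimates the number of objects to be returned and charges seats in proportion to that estimate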
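
To make the "seats proportional to returned objects" idea concrete, here is a rough Python sketch. The parameters `objects_per_seat` and `max_seats` are illustrative assumptions of mine, not values taken from the API server, and the function names are hypothetical.

```python
import math


def seats_for_list(estimated_objects, objects_per_seat=100, max_seats=10):
    """Seats charged to a list request, proportional to the estimated object count.

    objects_per_seat and max_seats are illustrative parameters, not the
    API server's actual settings.
    """
    seats = math.ceil(estimated_objects / objects_per_seat)
    # Every request takes at least one seat; the cap keeps one request from
    # consuming an entire priority level.
    return max(1, min(seats, max_seats))


def can_dispatch(seats_in_use, request_seats, concurrency_limit):
    """A request can be dispatched only if its seats fit under the level's limit."""
    return seats_in_use + request_seats <= concurrency_limit


print(seats_for_list(250), can_dispatch(8, seats_for_list(250), 10))
```

With these assumed parameters, a list expected to return 250 objects is charged 3 seats, so it has to wait when 8 of a level's 10 seats are already taken.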

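Execution time adjustments for watch requests

• watch requests deviate from the baseline in two ways
  ◦ How long a watch occupies its seat
    ▪ Depending on the request parameters, the response may begin with create notifications for all relevant pre-existing objects
    ▪ The watch is considered done with its seat once this initial burst of notifications, if any, is over.
  ◦ Normal notifications are sent as a concurrent burst to all relevant watch response streams on every object create/update/delete.
    ▪ To account for this work, every write request is treated as occupying its seats for some additional time after the write itself completes
    ▪ The server estimates the number of notifications to send and adjusts the write request's seat count and seat occupancy time accordingly.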

Queueing

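• Even within one priority level there can be many distinct sources of traffic (request streams). They should not starve each other.
• Fair-queueing algorithm
  ◦ flow
    ▪ Every request is assigned to a flow
    ▪ A flow is identified by the FlowSchema name plus a flow distinguisher (the requesting user, the target resource's namespace, or nothing)
    ▪ The system tries to give roughly equal weight to the flows within the same PL
    ▪ Controllers with many instances should authenticate with distinct usernames so that the instances can be handled separately.
    ▪ After classification into a flow, the request is assigned to a queue using shuffle sharding
      • This uses queues relatively efficiently to insulate low-intensity flows from high-intensity ones
  ◦ Tunable per PL, trading off among:
    ▪ memory use
    ▪ fairness (independent flows all keep making progress when total traffic exceeds capacity)
    ▪ tolerance for bursty traffic
    ▪ latency added by queueing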
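
The flow identification and shuffle sharding steps above can be sketched in Python as follows. This is only an illustration of the technique, not the API server's implementation: the hash function, the way the hand is dealt, and all names (`flow_id`, `deal_hand`, `pick_queue`) are my own assumptions. The idea is that each flow deterministically maps to a small "hand" of queues, and a request goes to the shortest queue in that hand.

```python
import hashlib


def flow_id(flow_schema, distinguisher):
    """A flow is the FlowSchema name plus the flow distinguisher (may be empty)."""
    return f"{flow_schema}/{distinguisher}"


def deal_hand(flow, num_queues, hand_size):
    """Deterministically pick hand_size distinct queue indices for a flow.

    Assumes hand_size <= num_queues. SHA-256 is an arbitrary choice here.
    """
    value = int.from_bytes(hashlib.sha256(flow.encode()).digest(), "big")
    remaining = list(range(num_queues))
    hand = []
    for _ in range(hand_size):
        # Consume part of the hash to choose one of the queues still available.
        value, index = divmod(value, len(remaining))
        hand.append(remaining.pop(index))
    return hand


def pick_queue(flow, queue_lengths, hand_size):
    """Enqueue into the shortest queue among the flow's hand."""
    hand = deal_hand(flow, len(queue_lengths), hand_size)
    return min(hand, key=lambda q: queue_lengths[q])


lengths = [5, 0, 3, 9, 1, 4, 2, 7]
flow = flow_id("service-accounts", "system:serviceaccount:demo:default")
print(deal_hand(flow, len(lengths), 3), pick_queue(flow, lengths, 3))
```

Because two flows rarely share their whole hand, a high-intensity flow that fills its own queues still leaves a low-intensity flow at least one short queue most of the time. A larger hand or more queues improves that isolation at the cost of memory, which matches the per-PL trade-offs listed above.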

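Exempt requests

• Some requests are important enough that they bypass all of these limits, so a misconfigured flow control setup cannot completely disable the API server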

Resources

PriorityLevelConfiguration(PLC)

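• Represents a single PL
• nominalConcurrencyShares
  ◦ The limit is expressed as a relative share (NCS), not as an absolute number of seats
  ◦ The nominal concurrency limit is computed from the shares:
    ▪ NominalCL(i) = ceil( ServerCL * NCS(i) / sum_ncs )
    ▪ sum_ncs = sum[priority level k] NCS(k)
  ◦ Default: 30
  ◦ A value of 0 builds a "jail" for the PL, which holds some requests
• borrowingLimitPercent
  ◦ A percentage of the nominal limit, rounded to an absolute number of seats: BorrowingCL(i) = round( NominalCL(i) * borrowingLimitPercent(i)/100.0 )
• lendablePercent
  ◦ LendableCL(i) = round( NominalCL(i) * lendablePercent(i)/100.0 )
• NominalCL(i) - LendableCL(i) ≤ dynamically adjusted CL(i) ≤ NominalCL(i) + BorrowingCL(i)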
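
A small worked example of the formulas above, as a Python sketch. The priority level names, share values, and percentages are made up for illustration. ServerCL = 600 assumes the default flag values (--max-requests-inflight=400 plus --max-mutating-requests-inflight=200). Python's built-in round() rounds halves to even, so the sketch uses an explicit round-half-up helper; which rounding the server applies at exact halves is an assumption here.

```python
import math


def round_half_up(x):
    """Round a non-negative number to the nearest integer, halves going up."""
    return math.floor(x + 0.5)


def nominal_limits(server_cl, shares):
    """NominalCL(i) = ceil(ServerCL * NCS(i) / sum_ncs), using integer arithmetic."""
    sum_ncs = sum(shares.values())
    return {name: (server_cl * ncs + sum_ncs - 1) // sum_ncs for name, ncs in shares.items()}


def limit_bounds(nominal_cl, lendable_percent, borrowing_limit_percent):
    """Lower and upper bound of the dynamically adjusted concurrency limit."""
    lendable_cl = round_half_up(nominal_cl * lendable_percent / 100.0)
    borrowing_cl = round_half_up(nominal_cl * borrowing_limit_percent / 100.0)
    return nominal_cl - lendable_cl, nominal_cl + borrowing_cl


# Hypothetical priority levels and shares; ServerCL = 400 + 200.
shares = {"level-a": 30, "level-b": 100, "level-c": 20}
nominal = nominal_limits(600, shares)
print(nominal)
print(limit_bounds(nominal["level-a"], lendable_percent=50, borrowing_limit_percent=25))
```

Here level-a gets 600 * 30 / 150 = 120 nominal seats. With lendablePercent 50 and borrowingLimitPercent 25 its adjusted limit can move between 60 and 150 seats.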

FlowSchema