How to Configure the Kubernetes PodGC Controller


This article explains how the Kubernetes PodGC Controller is configured and, by following its source code, how it works.

PodGC Controller configuration

Only two kube-controller-manager flags concern the PodGC Controller:

flag | default value | comments
--controllers stringSlice | * | The list of controllers to enable. podgc can be enabled or disabled here; by default it is in the enabled list.
--terminated-pod-gc-threshold int32 | 12500 | Number of terminated pods that can exist before the terminated pod garbage collector starts deleting terminated pods. If <= 0, the terminated pod garbage collector is disabled. (default 12500)

PodGC Controller entry point

The PodGC Controller is started when kube-controller-manager runs. The CMServer Run function invokes StartControllers, which iterates over the pre-registered, enabled controllers and starts them one by one.

cmd/kube-controller-manager/app/controllermanager.go:180
func Run(s *options.CMServer) error {
	...
	err := StartControllers(newControllerInitializers(), s, rootClientBuilder, clientBuilder, stop)
	...
}

newControllerInitializers registers the regular controllers together with their start functions. I call them regular because some controllers are not registered here, for example the very important service controller and node controller; I call those non-regular controllers.

func newControllerInitializers() map[string]InitFunc {
	controllers := map[string]InitFunc{}
	controllers["endpoint"] = startEndpointController
	...
	controllers["podgc"] = startPodGCController
	...
	return controllers
}

So the CMServer ultimately invokes startPodGCController to start the PodGC Controller.
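The registry-and-start pattern described above can be sketched with the standard library alone. This is a simplified illustration, not the kube-controller-manager code: InitFunc here takes no context argument, and isEnabled and startControllers are hypothetical helpers that assume the usual --controllers convention, where "*" enables everything, "name" enables one controller and "-name" disables it.

```go
package main

import (
	"fmt"
	"sort"
)

// InitFunc mirrors the shape used in the article: it starts one controller
// and reports whether it was started. The context argument is omitted here.
type InitFunc func() (bool, error)

// isEnabled is a simplified, hypothetical matcher for the --controllers list:
// an explicit "name" enables, "-name" disables, and "*" enables everything
// that was not explicitly disabled.
func isEnabled(name string, controllers []string) bool {
	star := false
	for _, c := range controllers {
		switch c {
		case name:
			return true
		case "-" + name:
			return false
		case "*":
			star = true
		}
	}
	return star
}

// startControllers walks the registry and starts every enabled controller,
// returning the names it started in sorted order.
func startControllers(registry map[string]InitFunc, enabled []string) ([]string, error) {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for stable output

	started := []string{}
	for _, name := range names {
		if !isEnabled(name, enabled) {
			continue
		}
		ok, err := registry[name]()
		if err != nil {
			return started, fmt.Errorf("starting %q: %v", name, err)
		}
		if ok {
			started = append(started, name)
		}
	}
	return started, nil
}

func main() {
	registry := map[string]InitFunc{
		"endpoint": func() (bool, error) { return true, nil },
		"podgc":    func() (bool, error) { return true, nil },
	}
	// podgc is skipped because of the explicit "-podgc" entry.
	started, _ := startControllers(registry, []string{"*", "-podgc"})
	fmt.Println(started)
}
```

With this convention, disabling the PodGC Controller would amount to adding "-podgc" to the list while keeping "*".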

cmd/kube-controller-manager/app/core.go:66
func startPodGCController(ctx ControllerContext) (bool, error) {
	go podgc.NewPodGC(
		ctx.ClientBuilder.ClientOrDie("pod-garbage-collector"),
		ctx.InformerFactory.Core().V1().Pods(),
		int(ctx.Options.TerminatedPodGCThreshold),
	).Run(ctx.Stop)
	return true, nil
}

startPodGCController is simple: it starts a goroutine that creates the PodGC controller and runs it.

Creating the PodGC Controller

Let's first look at the definition of PodGCController.

pkg/controller/podgc/gc_controller.go:44
type PodGCController struct {
	kubeClient             clientset.Interface
	podLister              corelisters.PodLister
	podListerSynced        cache.InformerSynced
	deletePod              func(namespace, name string) error
	terminatedPodThreshold int
}

kubeClient: the client used to talk to the APIServer.

podLister: a PodLister, which helps list Pods.

podListerSynced: reports whether the PodLister has synced.

deletePod: the function that calls the apiserver to delete a given pod.

terminatedPodThreshold: the value of --terminated-pod-gc-threshold; defaults to 12500.

pkg/controller/podgc/gc_controller.go:54
func NewPodGC(kubeClient clientset.Interface, podInformer coreinformers.PodInformer, terminatedPodThreshold int) *PodGCController {
	if kubeClient != nil && kubeClient.Core().RESTClient().GetRateLimiter() != nil {
		metrics.RegisterMetricAndTrackRateLimiterUsage("gc_controller", kubeClient.Core().RESTClient().GetRateLimiter())
	}
	gcc := &PodGCController{
		kubeClient:             kubeClient,
		terminatedPodThreshold: terminatedPodThreshold,
		deletePod: func(namespace, name string) error {
			glog.Infof("PodGC is force deleting Pod: %v:%v", namespace, name)
			return kubeClient.Core().Pods(namespace).Delete(name, metav1.NewDeleteOptions(0))
		},
	}
	gcc.podLister = podInformer.Lister()
	gcc.podListerSynced = podInformer.Informer().HasSynced
	return gcc
}

Creating the PodGC Controller simply assigns the PodGCController fields. Note the metav1.NewDeleteOptions(0) argument in the deletePod definition: it means the pod is deleted immediately, with no grace period.

Running the PodGC Controller

Once the PodGC Controller has been created, its Run method is called to start it.

pkg/controller/podgc/gc_controller.go:73
func (gcc *PodGCController) Run(stop <-chan struct{}) {
	if !cache.WaitForCacheSync(stop, gcc.podListerSynced) {
		utilruntime.HandleError(fmt.Errorf("timed out waiting for caches to sync"))
		return
	}
	go wait.Until(gcc.gc, gcCheckPeriod, stop)
	<-stop
}

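Run first checks every 100ms whether the PodLister has synced, until it has.

It then starts a goroutine that runs gcc.gc to collect pods, waits 20s after each run completes, and runs gcc.gc again, until the stop signal is received.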

pkg/controller/podgc/gc_controller.go:83
func (gcc *PodGCController) gc() {
	pods, err := gcc.podLister.List(labels.Everything())
	if err != nil {
		glog.Errorf("Error while listing all Pods: %v", err)
		return
	}
	if gcc.terminatedPodThreshold > 0 {
		gcc.gcTerminated(pods)
	}
	gcc.gcOrphaned(pods)
	gcc.gcUnscheduledTerminating(pods)
}

gcc.gc is the actual pod collection logic:

List all pods from the PodLister (no filter).

If terminatedPodThreshold is greater than 0, call gcc.gcTerminated(pods) to collect the terminated pods that exceed the threshold.

Call gcc.gcOrphaned(pods) to collect orphaned pods.

Call gcc.gcUnscheduledTerminating(pods) to collect unscheduled, terminating pods.

Note:

gcTerminated, gcOrphaned and gcUnscheduledTerminating run serially, one after another.

Within gcTerminated, the pods over the threshold are deleted in parallel; a sync.WaitGroup waits for all of those deletions to finish before gcTerminated returns, and only then can gcOrphaned start.

gcOrphaned and gcUnscheduledTerminating each delete their pods serially.

Collecting terminated pods

func (gcc *PodGCController) gcTerminated(pods []*v1.Pod) {
	terminatedPods := []*v1.Pod{}
	for _, pod := range pods {
		if isPodTerminated(pod) {
			terminatedPods = append(terminatedPods, pod)
		}
	}
	terminatedPodCount := len(terminatedPods)
	sort.Sort(byCreationTimestamp(terminatedPods))
	deleteCount := terminatedPodCount - gcc.terminatedPodThreshold
	if deleteCount > terminatedPodCount {
		deleteCount = terminatedPodCount
	}
	if deleteCount > 0 {
		glog.Infof("garbage collecting %v pods", deleteCount)
	}
	var wait sync.WaitGroup
	for i := 0; i < deleteCount; i++ {
		wait.Add(1)
		go func(namespace string, name string) {
			defer wait.Done()
			if err := gcc.deletePod(namespace, name); err != nil {
				// ignore not founds
				defer utilruntime.HandleError(err)
			}
		}(terminatedPods[i].Namespace, terminatedPods[i].Name)
	}
	wait.Wait()
}

Iterate over all pods and select the terminated ones (pods whose Pod.Status.Phase is not Pending, Running or Unknown).

Compute deleteCount, the number of terminated pods in excess of terminatedPodThreshold.

Start deleteCount goroutines that call gcc.deletePod (which invokes the apiserver's API) in parallel to delete the corresponding pods immediately.

Collecting pods bound to nodes that no longer exist

// gcOrphaned deletes pods that are bound to nodes that don't exist.
func (gcc *PodGCController) gcOrphaned(pods []*v1.Pod) {
	glog.V(4).Infof("GC'ing orphaned")
	// We want to get list of Nodes from the etcd, to make sure that it's as fresh as possible.
	nodes, err := gcc.kubeClient.Core().Nodes().List(metav1.ListOptions{})
	if err != nil {
		return
	}
	nodeNames := sets.NewString()
	for i := range nodes.Items {
		nodeNames.Insert(nodes.Items[i].Name)
	}
	for _, pod := range pods {
		if pod.Spec.NodeName == "" {
			continue
		}
		if nodeNames.Has(pod.Spec.NodeName) {
			continue
		}
		glog.V(2).Infof("Found orphaned Pod %v assigned to the Node %v. Deleting.", pod.Name, pod.Spec.NodeName)
		if err := gcc.deletePod(pod.Namespace, pod.Name); err != nil {
			utilruntime.HandleError(err)
		} else {
			glog.V(0).Infof("Forced deletion of orphaned Pod %s succeeded", pod.Name)
		}
	}
}

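gcOrphaned deletes pods whose bound node no longer exists.

Call the apiserver to get all Nodes.

Iterate over all pods; if a pod's NodeName is non-empty and is not among the Nodes just fetched, call gcc.deletePod for it. The deletions happen serially, one pod at a time.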
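The orphan check itself is a set-membership test. Here is a minimal sketch of just that selection step, using a hypothetical pod type and a plain map in place of sets.String; the names orphaned and known are illustrative only.

```go
package main

import "fmt"

// pod is a hypothetical stand-in that keeps only the fields the check needs.
type pod struct {
	Name     string
	NodeName string
}

// orphaned returns the names of pods that are bound to a node which is not
// in the given node list. Pods with an empty NodeName are skipped.
func orphaned(pods []pod, nodes []string) []string {
	known := make(map[string]struct{}, len(nodes))
	for _, n := range nodes {
		known[n] = struct{}{}
	}

	var out []string
	for _, p := range pods {
		if p.NodeName == "" {
			continue // not bound to any node, so not an orphan
		}
		if _, ok := known[p.NodeName]; ok {
			continue // the node still exists
		}
		out = append(out, p.Name)
	}
	return out
}

func main() {
	pods := []pod{
		{Name: "web", NodeName: "node-1"},
		{Name: "job", NodeName: "node-gone"},
		{Name: "pending", NodeName: ""},
	}
	fmt.Println(orphaned(pods, []string{"node-1", "node-2"}))
}
```

Only "job" qualifies here: "web" sits on a live node and "pending" has no node at all.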

Collecting unscheduled, terminating pods

pkg/controller/podgc/gc_controller.go:167
// gcUnscheduledTerminating deletes pods that are terminating and haven't been scheduled to a particular node.
func (gcc *PodGCController) gcUnscheduledTerminating(pods []*v1.Pod) {
	glog.V(4).Infof("GC'ing unscheduled pods which are terminating.")
	for _, pod := range pods {
		if pod.DeletionTimestamp == nil || len(pod.Spec.NodeName) > 0 {
			continue
		}
		glog.V(2).Infof("Found unscheduled terminating Pod %v not assigned to any Node. Deleting.", pod.Name)
		if err := gcc.deletePod(pod.Namespace, pod.Name); err != nil {
			utilruntime.HandleError(err)
		} else {
			glog.V(0).Infof("Forced deletion of unscheduled terminating Pod %s succeeded", pod.Name)
		}
	}
}

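gcUnscheduledTerminating deletes pods that are terminating and have not been scheduled to any node.

Iterate over all pods and select those that are terminating (pod.DeletionTimestamp != nil) and have not been scheduled (pod.Spec.NodeName is empty).

Call gcc.deletePod for each of them serially, one pod at a time.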
