Orphaned Resources in Kubernetes

Background

Orphaned resources are objects whose data has supposedly been deleted but still lingers in etcd. A simple example: a namespace was previously removed with a force delete, yet checking afterwards shows its resources are still there:

root@master1:~# kubectl  get pod -n kb-system
NAME                                            READY   STATUS                       RESTARTS      AGE
kb-addon-snapshot-controller-7df64b4d5b-jl56q   0/1     Error                        6718          255d
kubeblocks-77c594d646-nnl82                     0/1     CreateContainerConfigError   6 (31m ago)   255d
kubeblocks-dataprotection-f8dd6659b-2l8pd       0/1     CreateContainerConfigError   6 (31m ago)   255d

If we now try to delete it again, we are told it does not exist:

root@master1:~# kubectl  delete ns kb-system 
Error from server (NotFound): namespaces "kb-system" not found

root@master1:~# kubectl delete pod  $(kubectl  get pod -n kb-system | awk '{print $1}') -n kb-system
Error from server (NotFound): pods "NAME" not found
Error from server (NotFound): namespaces "kb-system" not found
Error from server (NotFound): namespaces "kb-system" not found
Error from server (NotFound): namespaces "kb-system" not found

This is exactly what an orphaned resource looks like. Orphaned resources still occupy resources inside Kubernetes and drag down system performance (see the detection sketch after this list), for example:

  • degraded API Server performance;
  • etcd data bloat;
  • even interference with deleting or re-creating the namespace.
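
To spot such leftovers, one rough approach is to compare the namespaces referenced by pod keys in etcd against the namespaces the API Server still knows about. A minimal sketch, assuming the ETCDCTL_* variables configured in the next section are already exported:

# List namespaces that still own pod keys in etcd but are unknown to the API Server.
# --keys-only emits a blank line after each key, which the NF pattern skips.
for ns in $(etcdctl get --prefix /registry/pods/ --keys-only | awk -F/ 'NF {print $4}' | sort -u); do
  kubectl get namespace "$ns" >/dev/null 2>&1 || echo "orphaned pod keys in etcd: $ns"
done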

Cleanup

The deletions above failed because the API Server first checks whether the namespace exists, and we were deleting through the API; as far as the API Server is concerned, the namespace is already gone.
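
The same can be confirmed against the raw REST API (a quick sketch; a NotFound error is the expected result):

# Ask the API Server for the namespace directly; a NotFound error shows
# it no longer exists from the API's point of view.
kubectl get --raw /api/v1/namespaces/kb-system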

So we have to delete the leftover data in etcd directly. Start by exporting the etcd certificates; adjust the paths to your environment:

export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
export ETCDCTL_CERT=/etc/ssl/etcd/ssl/admin-master1.pem
export ETCDCTL_KEY=/etc/ssl/etcd/ssl/admin-master1-key.pem
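
The paths above follow a Kubespray-style layout. On a kubeadm-provisioned cluster the equivalents usually live under /etc/kubernetes/pki/etcd; the paths below are the typical defaults, so verify them against your own nodes:

# kubeadm default certificate locations (an assumption; check your cluster).
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/healthcheck-client.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/healthcheck-client.key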

Then test the connection with etcdctl:

etcdctl member list

# If the etcd members are listed, the connection is working
3f2f31fb84eeea53, started, etcd-master3, https://10.10.254.14:2380, https://10.10.254.14:2379, false
4c83f4ebe5783013, started, etcd-master2, https://10.10.254.13:2380, https://10.10.254.13:2379, false
8079aa45283ecb82, started, etcd-master1, https://10.10.254.12:2380, https://10.10.254.12:2379, false
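
An endpoint health check gives a second confirmation (output shape varies by version):

# Each healthy endpoint reports "is healthy" along with the request latency.
etcdctl endpoint health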

Next, look up the keys of the affected pods:

etcdctl get --prefix /registry/pods/ --keys-only | grep kb-system
/registry/pods/kb-system/kb-addon-snapshot-controller-7df64b4d5b-jl56q
/registry/pods/kb-system/kubeblocks-77c594d646-nnl82
/registry/pods/kb-system/kubeblocks-dataprotection-f8dd6659b-2l8pd
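
To double-check that a key still holds data, its value can be dumped; note that Kubernetes stores objects protobuf-encoded, so the output is mostly binary (full decoding needs a separate tool such as auger):

# The first bytes are enough to show the stale object is still stored.
etcdctl get /registry/pods/kb-system/kubeblocks-77c594d646-nnl82 --print-value-only | head -c 200

Hand-editing etcd is risky, so take a snapshot first in case the deletion has to be rolled back (the target path is just an example):

# Back up etcd before deleting anything by hand.
etcdctl snapshot save /var/backups/etcd-$(date +%F).db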

Finally, run the deletion:

for prefix in pods namespaces configmaps secrets deployments replicasets services; do
  etcdctl del --prefix /registry/$prefix/kb-system/ || true
done
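
A quick etcd-level check that every prefix is now empty; each line should print 0 (a sketch):

# Count remaining keys under each kb-system prefix.
# grep -c exits non-zero when the count is 0, hence the || true.
for prefix in pods namespaces configmaps secrets deployments replicasets services; do
  printf '%s: ' "$prefix"
  etcdctl get --prefix /registry/$prefix/kb-system/ --keys-only | grep -c . || true
done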

Verifying again through the API, the stale pods are gone:

NAMESPACE     NAME                                READY   STATUS    RESTARTS         AGE
default       low-resource-app-6c94c7f769-6vj77   1/1     Running   7 (138m ago)     190d
kube-system   cilium-7b9l9                        1/1     Running   587 (138m ago)   257d
kube-system   cilium-kjdsz                        1/1     Running   587 (138m ago)   257d
kube-system   cilium-kk8v4                        1/1     Running   587 (138m ago)   257d
kube-system   cilium-operator-fddf55d65-pldkq     1/1     Running   672 (138m ago)   257d
kube-system   cilium-pbz67                        1/1     Running   588 (139m ago)   257d
kube-system   cilium-qxqnq                        1/1     Running   587 (139m ago)   257d
kube-system   cilium-xftk8                        1/1     Running   0                7m4s
kube-system   coredns-794b9d47c4-k54k8            1/1     Running   9 (138m ago)     257d
kube-system   coredns-794b9d47c4-zgvmk            1/1     Running   9 (138m ago)     257d
kube-system   kube-apiserver-master1              1/1     Running   558 (139m ago)   257d
kube-system   kube-apiserver-master2              1/1     Running   554 (139m ago)   257d
kube-system   kube-apiserver-master3              1/1     Running   555 (138m ago)   257d
kube-system   kube-controller-manager-master1     1/1     Running   11 (139m ago)    257d
kube-system   kube-controller-manager-master2     1/1     Running   9 (139m ago)     257d
kube-system   kube-controller-manager-master3     1/1     Running   9 (138m ago)     257d
kube-system   kube-proxy-4tsfp                    1/1     Running   9 (138m ago)     257d
kube-system   kube-proxy-g5fps                    1/1     Running   9 (138m ago)     257d
kube-system   kube-proxy-k2wf7                    1/1     Running   9 (139m ago)     257d
kube-system   kube-proxy-kzlmt                    1/1     Running   9 (138m ago)     257d
kube-system   kube-proxy-pj6kr                    1/1     Running   9 (138m ago)     257d
kube-system   kube-proxy-ssc49                    1/1     Running   9 (139m ago)     257d
kube-system   kube-scheduler-master1              1/1     Running   11 (139m ago)    257d
kube-system   kube-scheduler-master2              1/1     Running   9 (139m ago)     257d
kube-system   kube-scheduler-master3              1/1     Running   9 (138m ago)     257d
kube-system   metrics-server-75bf97fcc9-pvrkl     0/1     Running   4 (138m ago)     190d
kube-system   nodelocaldns-8gp4s                  1/1     Running   9 (138m ago)     257d
kube-system   nodelocaldns-cdpl2                  1/1     Running   9 (138m ago)     257d
kube-system   nodelocaldns-frqvs                  1/1     Running   9 (138m ago)     257d
kube-system   nodelocaldns-jr6b6                  1/1     Running   9 (139m ago)     257d
kube-system   nodelocaldns-kpvqg                  1/1     Running   9 (139m ago)     257d
kube-system   nodelocaldns-lprdx                  1/1     Running   9 (138m ago)     257d

How Orphaned/Ghost Resources Come About

Deleting a resource in Kubernetes is an asynchronous process. To delete a namespace, for instance, Kubernetes must first delete everything inside it: pods, deployments, and so on.

When kube-apiserver receives the delete request from kubectl or another client, it marks the namespace like this:

status:
  phase: Terminating
spec:
  finalizers:
  - kubernetes

The controller then cleans up the child resources as described above, and the namespace itself is deleted last.
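
On a namespace stuck mid-deletion this marking is directly observable (a sketch; <ns> is a placeholder):

# Prints "Terminating" while the namespace is being torn down.
kubectl get namespace <ns> -o jsonpath='{.status.phase}{"\n"}'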

flowchart TD
    A["kubectl delete ns"] --> B["API Server marks the Namespace as Terminating"]
    B --> C["Namespace Controller deletes all child resources in the namespace"]
    C --> D["All child resources (Pods / Services / Deployments / CRDs) cleared"]
    D --> E["Finalizer removed"]
    E --> F["etcd deletes the /registry/namespaces/<ns> key"]
    F --> G["Namespace fully deleted"]

A force delete, however, bypasses the controllers' normal cleanup logic. The child resources are not necessarily cleaned up and may be left behind in etcd; whatever remains there is an orphaned resource.
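
The "force delete" in question is typically the finalize trick shown below (a sketch; requires jq). Clearing the finalizers while child resources still exist is exactly what strands their keys in etcd:

# Strip the finalizers and call the namespace finalize subresource directly,
# bypassing the Namespace Controller's cleanup.
kubectl get namespace kb-system -o json \
  | jq '.spec.finalizers = []' \
  | kubectl replace --raw "/api/v1/namespaces/kb-system/finalize" -f -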

References

  1. https://help.aliyun.com/zh/asm/support/how-do-i-delete-a-namespace-in-the-terminating-state
  2. https://www.stackstate.com/blog/orphaned-resources-in-kubernetes-detection-impact-and-prevention-tips/
  3. https://argo-cd.readthedocs.io/en/release-2.11/user-guide/orphaned-resources/
