containerd Cheatsheet

Last Updated: 2024-02-06
# view containerd logs
$ journalctl -u containerd

ctr vs crictl:

  • ctr: containerd CLI, not related to k8s.
  • crictl: CLI for CRI-compatible container runtimes, related to k8s.

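crictl images shows the same images as ctr -n=k8s.io images ls

kind load imports images by running ctr with these arguments (the trailing "-" reads the image archive from stdin): "ctr", "--namespace=k8s.io", "images", "import", "--digests", "--snapshotter="+snapshotter, "-"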

What's in the containerd config

Config file: /etc/containerd/config.toml

  • sandbox_image (you can overwrite the pause image)
  • default runtime, e.g. "runc"
  • registry auth/ca/mirrors
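
To see which of these settings a node actually uses, one option is to filter the config file for the relevant keys. The sketch below assumes the CRI plugin key names sandbox_image and default_runtime_name (the second name does not appear above), and show_cri_settings is a hypothetical helper name.

```shell
# Hypothetical helper: print the sandbox image and default runtime lines
# from a containerd config file (defaults to /etc/containerd/config.toml).
show_cri_settings() {
  # Matches lines such as:  sandbox_image = "..."
  grep -E '^[[:space:]]*(sandbox_image|default_runtime_name)[[:space:]]*=' \
    "${1:-/etc/containerd/config.toml}"
}

# Usage on a node:
#   show_cri_settings
#   show_cri_settings /path/to/another/config.toml
```

If nothing is printed, the file probably does not override these keys and containerd falls back to its built-in defaults.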

Registry

version = 2

[plugins."io.containerd.grpc.v1.cri".registry]
   config_path = "/etc/containerd/certs.d"

Per registry config:

$ tree /etc/containerd/certs.d
/etc/containerd/certs.d
└── docker.io
    └── hosts.toml
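
For reference, a hosts.toml for docker.io might look like the file written by the sketch below. It is a minimal example under stated assumptions: https://mirror.example.com is a placeholder mirror, write_hosts_toml is a hypothetical helper, and the target directory is passed in so the file can be generated somewhere safe and reviewed before it is copied into /etc/containerd/certs.d.

```shell
# Hypothetical helper: write <dir>/docker.io/hosts.toml with one mirror.
# $1 = certs.d directory, $2 = mirror URL (placeholder by default).
write_hosts_toml() {
  certs_dir="$1"
  mirror="${2:-https://mirror.example.com}"
  mkdir -p "$certs_dir/docker.io"
  cat > "$certs_dir/docker.io/hosts.toml" <<EOF
# Upstream registry, used when no listed host can serve the request.
server = "https://docker.io"

# Mirror tried first for pulls and tag resolution.
[host."$mirror"]
  capabilities = ["pull", "resolve"]
EOF
}

# Usage:
#   write_hosts_toml /tmp/certs.d
#   cat /tmp/certs.d/docker.io/hosts.toml
```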

Check number of sandboxes and containers

# check the number of pod sandboxes:
$ ls /var/lib/containerd/io.containerd.grpc.v1.cri/sandboxes/ | wc -l

# check the number of containers
$ ls /var/lib/containerd/io.containerd.grpc.v1.cri/containers/ | wc -l
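
The two counts can be wrapped into one report, for example to run periodically on a node. This is only a sketch: count_entries and report_cri_counts are hypothetical helper names, and the default root is the directory used in the commands above.

```shell
# Hypothetical helper: count the direct children of a directory.
count_entries() {
  find "$1" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}

# Hypothetical helper: print sandbox and container counts under a CRI root.
report_cri_counts() {
  root="${1:-/var/lib/containerd/io.containerd.grpc.v1.cri}"
  echo "sandboxes=$(count_entries "$root/sandboxes")"
  echo "containers=$(count_entries "$root/containers")"
}

# Usage on a node (usually needs root):
#   report_cri_counts
```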

How to find and kill NotReady Pods

kubelet runs a garbage collection (GC) loop that scans for dead pods and removes them. The GC runs every 1m. It calls containerd (essentially like crictl pods) to get the current list of pods, then issues delete commands for the dead ones.

If many pods come and go within that interval, their sandboxes are kept until the next GC run.

If you see errors like the one below, there may be too many Pods (possibly many stuck NotReady pods): when kubelet lists them (ListAllSandboxes), the response exceeds the gRPC message size limit (16 MiB in this case).

rpc error: code = ResourceExhausted desc = grpc: trying to send message larger than max (16794825 vs. 16777216)
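
The two numbers in the message are the attempted message size and the limit, both in bytes. As a small worked example, the sketch below extracts them and prints the overage; grpc_overage is a hypothetical helper name and the parsing assumes the exact "max (SIZE vs. LIMIT)" wording shown above.

```shell
# Hypothetical helper: read a gRPC ResourceExhausted message on stdin and
# report by how many bytes the message exceeded the limit.
grpc_overage() {
  sed -n 's/.*max (\([0-9][0-9]*\) vs\. \([0-9][0-9]*\)).*/\1 \2/p' |
  while read -r size limit; do
    echo "size=$size limit=$limit over_by=$((size - limit)) limit_mib=$((limit / 1024 / 1024))"
  done
}

# Usage:
#   journalctl -u kubelet | grpc_overage
```

For the error above this gives over_by=17609 and limit_mib=16, so the sandbox list was only slightly over the limit.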

If kubelet finds containerd unhealthy, it restarts containerd; if kubelet keeps restarting containerd, node-problem-detector (npd) reports a FrequentContainerdRestart condition.

crictl rmp deletes a pod sandbox. A few ways to delete NotReady pods:

$ crictl rmp $(crictl pods -q --state NotReady)
$ crictl pods --state NotReady -o json | jq -r '.items[].id' | xargs -I% crictl rmp %
$ crictl pods | grep NotReady | cut -f1 -d" " | xargs -t -I {} crictl rmp {}
# truncate names and count occurrences
$ crictl pods --state NotReady -o json | jq -r '.items[].metadata.name' | cut -c1-20 | sort | uniq -c | sort
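
When there are very many NotReady pods, it can be safer to preview the list and remove sandboxes one at a time, so that a single failed removal does not abort the rest. The sketch below uses only crictl pods and crictl rmp from the commands above; remove_notready_pods and the DRY_RUN variable are hypothetical names.

```shell
# Hypothetical helper: remove NotReady pod sandboxes one at a time.
# Set DRY_RUN=1 to only print what would be removed.
remove_notready_pods() {
  crictl pods -q --state NotReady | while read -r pod_id; do
    # Skip empty lines.
    [ -n "$pod_id" ] || continue
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "would remove $pod_id"
    else
      # Keep going if one removal fails (for example on a timeout).
      # stdin is redirected so crictl cannot consume the ID list.
      crictl rmp "$pod_id" </dev/null || echo "failed to remove $pod_id" >&2
    fi
  done
}

# Usage:
#   DRY_RUN=1 remove_notready_pods   # preview
#   remove_notready_pods             # actually remove
```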

If you see the following error, containerd may be busy. Stop creating new Pods first (e.g. scale the deployments that create them down to 0).

"RemovePodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" podSandboxID="xxxxxxxx"

Get the latest pod

$ crictl pods --latest