Kubernetes - Artifacts and Registries
Originally, registries only stored container images, but they have since evolved to also store Helm charts and other OCI artifacts.
Artifacts
Containers
Think of "container" as just another packaging format, just like .iso files for disk images, .deb/.rpm for Linux packages, or .zip/.tgz for binary or arbitrary files.
The ecosystem is more than just a format. It includes:
- Image
- Distribute
- Runtime
- Orchestration
Unlike traditional virtualization, containerization takes place at the kernel level. Most modern operating system kernels now support the primitives necessary for containerization, including Linux with openvz, vserver, and more recently lxc.
A container image is a tar file containing tar files. Each inner tar file is a layer.
Read more: Containers vs VMs
OCI
At a high level, an OCI implementation downloads an OCI Image and then unpacks that image into an OCI Runtime filesystem bundle. At this point, the OCI Runtime Bundle is run by an OCI Runtime.
Blobs are stored flat, not nested: every blob is named <oci>/blobs/sha256/<digest>, whether the file is JSON or binary.
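The naming rule can be sketched in Go. This is a minimal illustration, assuming the layout directory is passed in as root; the function name blobPath is hypothetical and not part of any library.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// blobPath returns where a blob with the given content would live inside an
// OCI image layout rooted at root: <root>/blobs/sha256/<hex digest>.
// The file name depends only on the content, never on what the blob means.
func blobPath(root string, content []byte) string {
	sum := sha256.Sum256(content)
	return filepath.Join(root, "blobs", "sha256", hex.EncodeToString(sum[:]))
}

func main() {
	// A JSON document and a binary payload end up side by side in one directory.
	fmt.Println(blobPath("layout", []byte(`{"schemaVersion":2}`)))
	fmt.Println(blobPath("layout", []byte{0x1f, 0x8b, 0x08}))
}
```

Because the name is derived from the content, two identical layers shared by different images map to the same file.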
OCI mediaType examples:
application/vnd.oci.image.manifest.v1+json
application/vnd.cncf.helm.config.v1+json
application/vnd.cncf.helm.chart.content.v1.tar+gzip
application/vnd.docker.distribution.manifest.v2+json
application/vnd.docker.image.rootfs.diff.tar.gzip
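A client can tell what an artifact is by reading the mediaType fields of its manifest. The sketch below assumes a hand-written sample manifest with placeholder digests; the type and function names (descriptor, manifest, artifactKind) are illustrative, not taken from a library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// descriptor and manifest model only the fields this sketch needs.
type descriptor struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
}

type manifest struct {
	MediaType string       `json:"mediaType"`
	Config    descriptor   `json:"config"`
	Layers    []descriptor `json:"layers"`
}

// sampleManifest is made up for illustration; the digests are placeholders.
const sampleManifest = `{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {"mediaType": "application/vnd.cncf.helm.config.v1+json", "digest": "sha256:aaaa", "size": 2},
  "layers": [
    {"mediaType": "application/vnd.cncf.helm.chart.content.v1.tar+gzip", "digest": "sha256:bbbb", "size": 10}
  ]
}`

func parseManifest(data []byte) (manifest, error) {
	var m manifest
	err := json.Unmarshal(data, &m)
	return m, err
}

// artifactKind guesses what the manifest describes from its config mediaType.
func artifactKind(m manifest) string {
	switch m.Config.MediaType {
	case "application/vnd.cncf.helm.config.v1+json":
		return "helm chart"
	case "application/vnd.oci.image.config.v1+json",
		"application/vnd.docker.container.image.v1+json":
		return "container image"
	default:
		return "unknown"
	}
}

func main() {
	m, err := parseManifest([]byte(sampleManifest))
	if err != nil {
		panic(err)
	}
	fmt.Println("kind:", artifactKind(m))
	for _, l := range m.Layers {
		fmt.Println("layer:", l.MediaType)
	}
}
```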
Registries
- Docker Registry.
- zot: OCI-native container image registry, single binary; stores blobs on disk; does not support Docker-format images. https://zotregistry.dev/
- Harbor:
  - Deploy outside of a k8s cluster with docker-compose.
  - Deploy inside a k8s cluster with helm.
- Hosted:
  - Docker Hub (web: https://hub.docker.com/, registry: docker.io): free public repos, paid private repos; the default registry when running docker pull.
  - Google Container Registry (gcr.io): replaced by Google Artifact Registry.
  - Google Artifact Registry (pkg.dev)
  - quay.io by Red Hat
  - Amazon ECR
  - Azure Container Registry
  - IBM Cloud Container Registry
  - GitHub ghcr.io
Tools:
- crane: a tool for interacting with remote images and registries; supports both OCI and Docker formats.
Standardized: OCI Distribution Specification
The OCI Distribution Specification is an effort to standardize the Docker Registry v2 API. All registries mentioned above should conform to the spec, which means the registries follow the same structure, and the same set of clients can talk to any of them.
Registry Hierarchy
Registry (e.g. Harbor) > Project > Repository > Image
An image is identified by registry_address/project/repo:tag.
The default public project is named library.
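The identifier can be split into its parts with a few lines of Go. This is a simplified sketch, not a full reference parser: it assumes all of registry, project and repo are present, it does not handle digest references (@sha256:...), and defaulting a missing tag to latest is an assumption borrowed from Docker's convention. The names imageRef and parseRef are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// imageRef holds the four parts of registry_address/project/repo:tag.
type imageRef struct {
	Registry, Project, Repo, Tag string
}

// parseRef splits a reference of the form registry/project/repo[:tag].
// The registry part may contain a port (host:port); only the last path
// component is searched for the tag separator.
func parseRef(s string) (imageRef, error) {
	parts := strings.SplitN(s, "/", 3)
	if len(parts) != 3 {
		return imageRef{}, fmt.Errorf("%q is not of the form registry/project/repo[:tag]", s)
	}
	repo, tag := parts[2], "latest" // assumption: a missing tag means "latest"
	if i := strings.LastIndex(repo, ":"); i >= 0 {
		repo, tag = repo[:i], repo[i+1:]
	}
	if parts[0] == "" || parts[1] == "" || repo == "" || tag == "" {
		return imageRef{}, fmt.Errorf("%q has an empty component", s)
	}
	return imageRef{Registry: parts[0], Project: parts[1], Repo: repo, Tag: tag}, nil
}

func main() {
	for _, s := range []string{
		"192.168.1.10/library/nginx:1.25",
		"harbor.example.com:8443/library/foo",
	} {
		ref, err := parseRef(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", ref)
	}
}
```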
Search in the registry
OCI-based registries don't provide standard APIs to facilitate searching. We can use curl to see the content of the registry:
$ curl --insecure -X GET https://192.168.x.y/v2/_catalog -u user:password
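Since there is no search endpoint, a client has to filter the _catalog response itself. A minimal sketch, assuming the response body has already been fetched and has the shape {"repositories": [...]}; the function name searchCatalog and the sample repository names are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// catalog mirrors the JSON body returned by GET /v2/_catalog.
type catalog struct {
	Repositories []string `json:"repositories"`
}

// searchCatalog returns the repositories whose name contains term.
// The filtering happens entirely on the client side.
func searchCatalog(catalogJSON []byte, term string) ([]string, error) {
	var c catalog
	if err := json.Unmarshal(catalogJSON, &c); err != nil {
		return nil, fmt.Errorf("decode catalog: %w", err)
	}
	var hits []string
	for _, repo := range c.Repositories {
		if strings.Contains(repo, term) {
			hits = append(hits, repo)
		}
	}
	return hits, nil
}

func main() {
	// Sample body, as it might be returned by the curl command above.
	body := []byte(`{"repositories":["library/nginx","library/redis","team-a/nginx-exporter"]}`)
	hits, err := searchCatalog(body, "nginx")
	if err != nil {
		panic(err)
	}
	fmt.Println(hits)
}
```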
Registry Clients
Registries like Harbor have a built-in UI but no CLI of their own. Use the CLI that matches the artifact type to talk to the registry:
- docker for Docker images
- oras for OCI artifacts
- helm for Helm charts
Manually Delete Artifacts of a Certain Tag
# Assumes HOST is set to the registry address.
# Get the names of the artifacts
names=$(curl -sS -u user:password "https://$HOST/v2/_catalog" | jq -r '.repositories | .[]')
tag="tag-to-delete"
for name in $names; do
  # Check whether this repository has the tag. The tag is passed with --arg so
  # that jq sees it as a string, and matched exactly because the manifest
  # lookup below needs the exact tag.
  to_delete=$(curl -sS -u user:password "https://$HOST/v2/$name/tags/list" | jq -r --arg tag "$tag" '.tags[]? | select(. == $tag)')
  if [ -n "$to_delete" ]
  then
    # Cannot delete by tag; can only delete by digest (docker-content-digest).
    # The header name is compared case-insensitively.
    digest=$(curl -sSL -I -u user:password \
      -H "Accept: application/vnd.docker.distribution.manifest.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json" \
      "https://$HOST/v2/$name/manifests/$tag" | awk 'tolower($1) == "docker-content-digest:" { print $2 }' | tr -d $'\r')
    # Delete.
    curl -sSL -u user:password -X DELETE "https://$HOST/v2/$name/manifests/$digest"
  fi
done
Image Pull
Kubelet pulls container images serially by default. See the --serialize-image-pulls flag in https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#options.
Kubelet emits the PullingImage event as soon as the pull request enters the queue: https://github.com/kubernetes/kubernetes/blob/43a2bb4df4ff3c255337f5ed927b8fd17c38c235/pkg/kubelet/images/image_manager.go#L152.
However, the actual containerd image pull request is sent later, when the request reaches the front of the queue: https://github.com/kubernetes/kubernetes/blob/43a2bb4df4ff3c255337f5ed927b8fd17c38c235/pkg/kubelet/images/puller.go#L96
This behavior is configurable through --serialize-image-pulls, but enabling parallel image pulls may overload the node. There is no best solution to limit the parallelism yet; it is being worked on in https://github.com/kubernetes/kubernetes/pull/115220
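The trade-off between serial and unbounded parallel pulls can be illustrated with a counting semaphore. This is a generic Go sketch of bounded parallelism, not kubelet's implementation; pullAll and its parameters are hypothetical, and the pull itself is simulated with a sleep.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// pullAll calls pull once per image with at most limit pulls in flight and
// returns the highest number of concurrent pulls it observed.
// limit must be at least 1; limit == 1 behaves like serialized pulls.
func pullAll(images []string, limit int, pull func(image string)) int32 {
	sem := make(chan struct{}, limit) // one slot per allowed concurrent pull
	var wg sync.WaitGroup
	var inFlight, peak int32
	for _, img := range images {
		wg.Add(1)
		go func(img string) {
			defer wg.Done()
			sem <- struct{}{}        // wait for a free slot
			defer func() { <-sem }() // give the slot back
			n := atomic.AddInt32(&inFlight, 1)
			for { // record the peak concurrency
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			pull(img)
			atomic.AddInt32(&inFlight, -1)
		}(img)
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	images := []string{"a:1", "b:1", "c:1", "d:1"}
	slowPull := func(string) { time.Sleep(10 * time.Millisecond) } // simulated pull
	fmt.Println("peak with limit 1:", pullAll(images, 1, slowPull))
	fmt.Println("peak with limit 2:", pullAll(images, 2, slowPull))
}
```

Requests still queue up as soon as they are submitted; the limit only controls how many of them are actually being pulled at once.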
Registry Mirror
A registry mirror redirects pulls: when you try to pull an image from docker.io or gcr.io, the image is pulled from a private registry instead.
This is especially useful in an air-gapped environment (i.e. no internet access).
For example:
$ crictl pull gcr.io/my-project/foo:1.0.0
- Normally: crictl -> containerd -> gcr.io / docker.io
- With mirror: crictl -> containerd (reads registrymirror in /etc/containerd/config.toml) -> pull from the mirror registry instead of gcr.io
Note: containerd is different from docker; the containerd registry mirror does not work for docker pull.
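Conceptually, the mirror only changes which host is contacted; the image keeps its original name. The Go sketch below illustrates that idea only. It is not containerd's code or its configuration format, and pullPlan, planPull, and the mirror address are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// pullPlan says which host to contact and which image name to ask for.
type pullPlan struct {
	Endpoint string // host that actually receives the request
	Image    string // image reference, unchanged by the mirror
}

// planPull looks up the registry host of ref in mirrors (upstream host ->
// mirror host). If a mirror is configured, requests go there instead.
func planPull(ref string, mirrors map[string]string) pullPlan {
	host, _, _ := strings.Cut(ref, "/")
	endpoint := host
	if m, ok := mirrors[host]; ok {
		endpoint = m
	}
	return pullPlan{Endpoint: endpoint, Image: ref}
}

func main() {
	mirrors := map[string]string{"gcr.io": "192.168.1.10"} // example mirror address
	fmt.Printf("%+v\n", planPull("gcr.io/my-project/foo:1.0.0", nil))
	fmt.Printf("%+v\n", planPull("gcr.io/my-project/foo:1.0.0", mirrors))
}
```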
From Go
Read image index ($PATH/index.json file):
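import (
	"fmt"

	"github.com/google/go-containerregistry/pkg/v1/layout"
)

// readIndexAnnotations loads the OCI image layout at path and returns the
// annotations stored in its index.json.
func readIndexAnnotations(path string) (map[string]string, error) {
	index, err := layout.ImageIndexFromPath(path)
	if err != nil {
		return nil, fmt.Errorf("failed to load %s as an OCI index: %w", path, err)
	}
	indexManifest, err := index.IndexManifest()
	if err != nil {
		return nil, fmt.Errorf("failed to get IndexManifest: %w", err)
	}
	// Get annotations
	return indexManifest.Annotations, nil
}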