Kubernetes - Artifacts and Registries

Originally, registries stored only container images; they have since evolved to also store Helm charts and other OCI artifacts.

Artifacts

Containers

Think of a "container" as just another packaging format, just like .iso files for disk images, .deb/.rpm for Linux packages, or .zip/.tgz for binaries or arbitrary files.

The ecosystem is more than just a format; it includes:

  • Image
  • Distribution
  • Runtime
  • Orchestration

Unlike traditional virtualization, containerization takes place at the kernel level. Most modern operating system kernels now support the primitives necessary for containerization, including Linux with OpenVZ, Linux-VServer and, more recently, LXC.

A container image is a tar file containing tar files. Each inner tar file is a layer.

Read more: Containers vs VMs

OCI

At a high level, an OCI implementation downloads an OCI Image, then unpacks that image into an OCI Runtime filesystem bundle. At this point the OCI Runtime Bundle is run by an OCI Runtime.

Blobs are stored flat, not nested: every blob is named <oci>/blobs/sha256/<digest>, whether the file is JSON or binary.

OCI mediaType examples:

  • application/vnd.oci.image.manifest.v1+json
  • application/vnd.cncf.helm.config.v1+json
  • application/vnd.cncf.helm.chart.content.v1.tar+gzip
  • application/vnd.docker.distribution.manifest.v2+json
  • application/vnd.docker.image.rootfs.diff.tar.gzip
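The mediaType is what lets a client tell artifacts apart. A minimal Go sketch, assuming the manifest JSON is already in memory: the struct types and artifactKind are hypothetical helpers that model only the fields needed here, and the two container image config media types are the standard OCI and Docker ones.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// descriptor and manifest model only the fields this sketch needs.
type descriptor struct {
	MediaType string `json:"mediaType"`
}

type manifest struct {
	MediaType string       `json:"mediaType"`
	Config    descriptor   `json:"config"`
	Layers    []descriptor `json:"layers"`
}

// artifactKind guesses what a manifest describes from its config mediaType.
func artifactKind(manifestJSON []byte) (string, error) {
	var m manifest
	if err := json.Unmarshal(manifestJSON, &m); err != nil {
		return "", fmt.Errorf("failed to parse manifest: %w", err)
	}
	switch m.Config.MediaType {
	case "application/vnd.cncf.helm.config.v1+json":
		return "helm chart", nil
	case "application/vnd.oci.image.config.v1+json",
		"application/vnd.docker.container.image.v1+json":
		return "container image", nil
	default:
		return "unknown", nil
	}
}

func main() {
	helm := []byte(`{
	  "schemaVersion": 2,
	  "mediaType": "application/vnd.oci.image.manifest.v1+json",
	  "config": {"mediaType": "application/vnd.cncf.helm.config.v1+json"},
	  "layers": [{"mediaType": "application/vnd.cncf.helm.chart.content.v1.tar+gzip"}]
	}`)
	kind, err := artifactKind(helm)
	fmt.Println(kind, err)
}
```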

Registries

  • Docker Registry.
  • zot: OCI-native container image registry, single binary; stores blobs on disk; does not support Docker-format images. https://zotregistry.dev/
  • Harbor:
    • deploy outside of k8s cluster by docker-compose.
    • deploy inside k8s cluster by helm.
  • Hosted:
    • Docker Hub (web: https://hub.docker.com/, registry: docker.io): free public repos, paid private repos; the default registry when running docker pull.
    • Google Container Registry (gcr.io): replaced by Google Artifact Registry.
    • Google Artifact Registry (pkg.dev)
    • quay.io by Red Hat.
    • Amazon ECR
    • Azure Container Registry
    • IBM Cloud Container Registry
    • GitHub ghcr.io

Tools:

  • crane: a tool for interacting with remote images and registries; supports both OCI and Docker formats.

Standardized: OCI Distribution Specification

The OCI Distribution Specification is an effort to standardize the Docker Registry v2 API. All registries mentioned above should conform to the spec, which means the registries follow the same structure, and the same set of clients can talk to any of them.

Registry Hierarchy

Registry (e.g. Harbor) > Project > Repository > Image

An image is identified by

registry_address/project/repo:tag

The default public project is named library.

Search in the registry

OCI-based registries don't provide standard APIs to facilitate searching. We can use curl to list the repositories in the registry:

$ curl --insecure -X GET https://192.168.x.y/v2/_catalog -u user:password

Registry Clients

Registries like Harbor have a built-in UI, but no CLI. Use helm or docker to talk to the registry.

Use a different CLI for each artifact type:

  • docker for docker images
  • oras for OCI artifacts
  • helm for Helm Charts

Manually Delete Artifacts of a Certain Tag

# Get the names of the repositories. $HOST must be set to the registry address.
names=$(curl -sS -u user:password "https://$HOST/v2/_catalog" | jq -r '.repositories[]')
tag="tag-to-delete"
for name in $names; do
  # Non-empty only if the repository has a tag equal to $tag.
  # Pass the tag with --arg so jq treats it as a string, not as an expression.
  to_delete=$(curl -sS -u user:password "https://$HOST/v2/$name/tags/list" | jq -r --arg tag "$tag" 'select(any(.tags[]?; . == $tag)) | .name')
  if [ -n "$to_delete" ]
  then
    # Cannot delete by tag; can only delete by digest (docker-content-digest).
    # Header names are case-insensitive, so compare in lower case.
    digest=$(curl -sSL -I -u user:password \
            -H "Accept: application/vnd.docker.distribution.manifest.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json" \
            "https://$HOST/v2/$name/manifests/$tag" | awk 'tolower($1) == "docker-content-digest:" { print $2 }' | tr -d $'\r')
    # Delete. This removes the manifest, so every tag pointing at this digest goes away.
    curl -sSL -u user:password -X DELETE "https://$HOST/v2/$name/manifests/$digest"
  fi
done
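The resolve-then-delete step can also be sketched in Go. This is a hedged illustration, not a complete client: deleteTag and fakeRegistry are hypothetical helpers, authentication is omitted, and the fake server only knows one repository, one tag, and a made-up digest.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

const manifestTypes = "application/vnd.docker.distribution.manifest.v2+json, " +
	"application/vnd.oci.image.manifest.v1+json, " +
	"application/vnd.oci.image.index.v1+json"

// deleteTag resolves repo:tag to a digest with a HEAD request, then deletes
// the manifest by that digest. It returns the digest it resolved.
func deleteTag(client *http.Client, baseURL, repo, tag string) (string, error) {
	head, err := http.NewRequest(http.MethodHead, fmt.Sprintf("%s/v2/%s/manifests/%s", baseURL, repo, tag), nil)
	if err != nil {
		return "", err
	}
	head.Header.Set("Accept", manifestTypes)
	resp, err := client.Do(head)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	digest := resp.Header.Get("Docker-Content-Digest")
	if resp.StatusCode != http.StatusOK || digest == "" {
		return "", fmt.Errorf("cannot resolve %s:%s: %s", repo, tag, resp.Status)
	}
	del, err := http.NewRequest(http.MethodDelete, fmt.Sprintf("%s/v2/%s/manifests/%s", baseURL, repo, digest), nil)
	if err != nil {
		return digest, err
	}
	resp, err = client.Do(del)
	if err != nil {
		return digest, err
	}
	resp.Body.Close()
	if resp.StatusCode != http.StatusAccepted {
		return digest, fmt.Errorf("delete of %s failed: %s", digest, resp.Status)
	}
	return digest, nil
}

// fakeRegistry is a hypothetical stand-in for a real registry.
func fakeRegistry() *httptest.Server {
	const digest = "sha256:0123abcd" // made-up digest for the demo
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch {
		case r.Method == http.MethodHead && r.URL.Path == "/v2/library/nginx/manifests/old":
			w.Header().Set("Docker-Content-Digest", digest)
		case r.Method == http.MethodDelete && r.URL.Path == "/v2/library/nginx/manifests/"+digest:
			w.WriteHeader(http.StatusAccepted)
		default:
			http.NotFound(w, r)
		}
	}))
}

func main() {
	srv := fakeRegistry()
	defer srv.Close()
	digest, err := deleteTag(srv.Client(), srv.URL, "library/nginx", "old")
	fmt.Println(digest, err)
}
```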

Image Pull

The kubelet pulls container images serially by default. See the --serialize-image-pulls flag in https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#options.

The kubelet emits the PullingImage event as soon as the pull request enters the queue: https://github.com/kubernetes/kubernetes/blob/43a2bb4df4ff3c255337f5ed927b8fd17c38c235/pkg/kubelet/images/image_manager.go#L152.

However, the actual containerd image pull request is sent later, when the request reaches the front of the queue: https://github.com/kubernetes/kubernetes/blob/43a2bb4df4ff3c255337f5ed927b8fd17c38c235/pkg/kubelet/images/puller.go#L96. The time observed after the event therefore includes time spent waiting in the queue, not only the download itself.

The behavior is configurable through --serialize-image-pulls, but enabling parallel image pulls may overload the node. There is no agreed-upon way to limit the parallelism yet; it is being worked on in https://github.com/kubernetes/kubernetes/pull/115220

Registry Mirror

Registry mirror: when you try to pull an image from docker.io or gcr.io, the image is pulled from a private registry instead.

This is especially useful in air-gapped environments (i.e. no internet access).

For example:

$ crictl pull gcr.io/my-project/foo:1.0.0
  • Normally: crictl -> containerd -> gcr.io / docker.io
  • With mirror: crictl -> containerd (reads registrymirror in /etc/containerd/config.toml) -> pull from the mirror registry instead of gcr.io

Note: containerd is different from docker; the containerd registry mirror setting does not apply to docker pull.

From Go

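Read the image index (the index.json file at the root of the OCI layout directory):

import (
  "fmt"

  "github.com/google/go-containerregistry/pkg/v1/layout"
)

// readIndexAnnotations loads the OCI image layout at path and returns the
// annotations of its top-level index.json.
func readIndexAnnotations(path string) (map[string]string, error) {
  index, err := layout.ImageIndexFromPath(path)
  if err != nil {
    return nil, fmt.Errorf("failed to load %s as an OCI index: %w", path, err)
  }
  indexManifest, err := index.IndexManifest()
  if err != nil {
    return nil, fmt.Errorf("failed to get IndexManifest: %w", err)
  }

  // Get annotations
  return indexManifest.Annotations, nil
}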