Kubernetes - Storage
There are two components to storage.
- System storage: stored locally, on the master nodes (e.g., etcd, keys, certificates) and on the worker nodes (e.g., logs, metrics).
  - Etcd: fault tolerance can be achieved either through master replication (i.e., running multiple masters, each using non-fault-tolerant local storage) or by a single master reading from and writing to fault-tolerant storage.
  - Keys, certificates, and audit logs: require encryption and restricted mutability.
  - System logs (e.g., Fluentd) and metrics (e.g., Prometheus): may not require fault-tolerant storage, as they are usually exported to the cloud and typically need local storage for buffering only (e.g., to cover up to 24h of network unavailability).
- Application storage: requires CSI drivers for customer-provided external storage. Options:
  - use pre-existing fault-tolerant on-prem storage solutions like NetApp or EMC
  - use a storage solution on top of a K8s cluster:
    - fault-tolerant K8s-managed storage: e.g., Ceph, EdgeFS, etc.
    - non-fault-tolerant: e.g., Persistent Local Volumes.
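As an illustration of the non-fault-tolerant option, here is a minimal sketch of a Persistent Local Volume. The class name `local-storage`, the disk path `/mnt/disks/ssd1`, and the node name `worker-1` are assumptions chosen for the example, not values from these notes.

```yaml
# StorageClass for statically provisioned local volumes.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage                        # hypothetical name
provisioner: kubernetes.io/no-provisioner    # local volumes are not dynamically provisioned
volumeBindingMode: WaitForFirstConsumer      # delay binding until a Pod is scheduled
---
# A PersistentVolume pinned to one node's local disk.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv                     # hypothetical name
spec:
  capacity:
    storage: 100Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1                    # assumed mount point on the node
  nodeAffinity:                              # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-1                   # assumed node name
```

Because the data lives on a single node's disk, losing that node means losing the volume, which is why this option is listed as non-fault-tolerant.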
- fault tolerance (persisted state must be durable) and
- bootstrapping (storage must be available even before the cluster control plane is fully operational)
- Standard Kubernetes volume primitives:
- Backed by local disks
- Manage sharing via Pod ephemeral-storage requests/limits, node allocatable
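A minimal sketch of how a Pod can declare ephemeral-storage requests and limits for disk-backed scratch space. The Pod name, image, and sizes are illustrative assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-example              # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36            # any image works; chosen for illustration
      command: ["sh", "-c", "sleep 3600"]
      resources:
        requests:
          ephemeral-storage: "1Gi"   # used by the scheduler against node allocatable
        limits:
          ephemeral-storage: "2Gi"   # exceeding this can get the Pod evicted
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir:
        sizeLimit: 1Gi               # cap for this emptyDir volume
```

The request counts against the node's allocatable ephemeral storage at scheduling time, while the limit is enforced by the kubelet at runtime.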
- Standard Kubernetes persistent volume primitives:
- VolumeSnapshot (requires CSI driver support).
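A minimal sketch of a VolumeSnapshot, assuming the snapshot CRDs and snapshot controller are installed in the cluster and the CSI driver supports snapshots. The class name `csi-snapclass` and snapshot name are hypothetical; the driver name and PVC name reuse the placeholders that appear elsewhere in these notes.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-snapclass                 # hypothetical name
driver: csi-driver.example.com        # placeholder driver name
deletionPolicy: Delete                # remove the backend snapshot with the object
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-snapshot                   # hypothetical name
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: my-request-for-storage   # PVC to snapshot
```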
- k8s -> Trident -> ONTAP
- k8s -> Rook -> Ceph
Cloud big 3:
- Amazon EBS
- Google Persistent Disk
- Azure Disk Storage
- NetApp Trident
- Red Hat Container Storage Platform
- MayaData Kubera
Traditional Storage Vendors:
- Dell EMC
- Pure Storage
- HPE Storage
Open Source Projects
Backend technology or protocols
Using CSI, third-party storage providers can write and deploy plugins exposing new storage systems in Kubernetes without ever having to touch the core Kubernetes code.
CSI (Container Storage Interface) is a specification: a standard for exposing arbitrary block and file storage systems to containerized workloads on Container Orchestration Systems (COs) like Kubernetes. Kubernetes has its own CSI implementation.
CSI driver: named as the provisioner in a StorageClass; a PVC references the StorageClass in its storageClassName field.
    kind: StorageClass
    provisioner: csi-driver.example.com
Pod to PVC:
    kind: Pod
    spec:
      volumes:
        - name: foo
          persistentVolumeClaim:
            claimName: my-request-for-storage
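The fragments above can be filled out into a complete chain from StorageClass to PVC to Pod. This is a sketch only: the class name `fast-storage`, the Pod name, the image, the mount path, and the requested size are assumptions added for illustration.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage                    # hypothetical name
provisioner: csi-driver.example.com     # the CSI driver that provisions volumes
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-request-for-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi                      # illustrative size
  storageClassName: fast-storage        # links the claim to the StorageClass
---
apiVersion: v1
kind: Pod
metadata:
  name: my-pod                          # hypothetical name
spec:
  containers:
    - name: my-frontend
      image: nginx                      # illustrative image
      volumeMounts:
        - name: foo
          mountPath: /var/www/html      # illustrative mount path
  volumes:
    - name: foo
      persistentVolumeClaim:
        claimName: my-request-for-storage   # links the Pod to the claim
```

The Pod never names the driver directly: it refers to the claim, the claim refers to the class, and the class names the provisioner.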
- Kubelet directly issues CSI calls (like NodePublishVolume, etc.) to CSI drivers via a Unix Domain Socket to mount and unmount volumes.
- Kubelet discovers CSI drivers (and the Unix Domain Socket to use to interact with a CSI driver) via the kubelet plugin registration mechanism.
- Kubernetes master components do not communicate directly (via a Unix Domain Socket or otherwise) with CSI drivers. Kubernetes master components interact only with the Kubernetes API.
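The registration mechanism described above is commonly wired up with a node DaemonSet that pairs the driver with the node-driver-registrar sidecar. The following is a sketch under several assumptions: the driver image and its `--endpoint` flag are hypothetical, the registrar image tag is only an example, and the host paths assume the default kubelet root directory `/var/lib/kubelet`.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-csi-node                # hypothetical name
spec:
  selector:
    matchLabels:
      app: example-csi-node
  template:
    metadata:
      labels:
        app: example-csi-node
    spec:
      containers:
        - name: node-driver-registrar
          image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0  # example tag
          args:
            - --csi-address=/csi/csi.sock   # socket path as seen inside this container
            # socket path as seen by the kubelet on the host
            - --kubelet-registration-path=/var/lib/kubelet/plugins/csi-driver.example.com/csi.sock
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
            - name: registration-dir
              mountPath: /registration
        - name: csi-driver
          image: example.com/csi-driver:latest   # hypothetical driver image
          args:
            - --endpoint=unix:///csi/csi.sock    # hypothetical flag; varies by driver
          securityContext:
            privileged: true                     # typically needed to perform mounts
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
            - name: pods-mount-dir
              mountPath: /var/lib/kubelet/pods
              mountPropagation: Bidirectional    # make mounts visible to the host
      volumes:
        - name: plugin-dir                       # where the driver's socket lives
          hostPath:
            path: /var/lib/kubelet/plugins/csi-driver.example.com/
            type: DirectoryOrCreate
        - name: registration-dir                 # watched by the kubelet for new plugins
          hostPath:
            path: /var/lib/kubelet/plugins_registry/
            type: Directory
        - name: pods-mount-dir
          hostPath:
            path: /var/lib/kubelet/pods
            type: Directory
```

The sidecar places a registration socket in the kubelet's plugin registry directory; the kubelet then learns the driver's name and the socket on which to issue calls such as NodePublishVolume.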
NetApp Harvest: The default package collects performance, capacity and hardware metrics from ONTAP clusters. https://github.com/NetApp/harvest
Trident is an external provisioner controller:
- runs as a Kubernetes pod or deployment; provides dynamic storage orchestration services for Kubernetes workloads.
- monitors activities on PVC / PV / StorageClass
- a single provisioner for different storage platforms (ONTAP and others)
- Trident CSI driver talks to ONTAP REST API
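For orientation, a Trident backend for ONTAP is typically declared with a TridentBackendConfig object plus a Secret holding credentials. This is a sketch modeled on the Trident documentation; every name, address, and credential below is a placeholder assumption, and field availability may vary between Trident versions.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: backend-ontap-nas-secret        # hypothetical name
type: Opaque
stringData:
  username: admin-user                  # placeholder credentials
  password: change-me
---
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: backend-ontap-nas               # hypothetical name
spec:
  version: 1
  storageDriverName: ontap-nas          # NFS-backed ONTAP driver
  backendName: ontap-nas-backend        # hypothetical backend name
  managementLIF: 10.0.0.1               # placeholder ONTAP management address
  svm: trident_svm                      # placeholder storage virtual machine
  credentials:
    name: backend-ontap-nas-secret      # refers to the Secret above
```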
How Trident interacts with Kubernetes (from the official Trident documentation):
- A user creates a PersistentVolumeClaim requesting a new PersistentVolume of a particular size from a Kubernetes StorageClass that was previously configured by the administrator.
- The Kubernetes StorageClass identifies Trident as its provisioner and includes parameters that tell Trident how to provision a volume for the requested class.
- Trident looks at its own Trident StorageClass with the same name that identifies the matching StoragePools that it can use to provision volumes for the class.
- Trident provisions storage on a matching backend and creates two objects: a PersistentVolume in Kubernetes that tells Kubernetes how to find, mount and treat the volume, and a Volume in Trident that retains the relationship between the PersistentVolume and the actual storage.
- Kubernetes binds the PersistentVolumeClaim to the new PersistentVolume. Pods that include the PersistentVolumeClaim will mount that PersistentVolume on any host that it runs on.
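The first two steps of this workflow can be sketched as a StorageClass that names Trident as its provisioner and a claim against that class. The class name `ontap-gold`, the claim name `app-data`, and the size are assumptions for illustration; `backendType` is one of the parameters Trident documents for selecting a backend.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ontap-gold                      # hypothetical name, configured by the administrator
provisioner: csi.trident.netapp.io      # identifies Trident as the provisioner
parameters:
  backendType: "ontap-nas"              # tells Trident which kind of backend to use
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                        # hypothetical name, created by the user
spec:
  accessModes:
    - ReadWriteMany                     # NFS-backed volumes can be shared across nodes
  resources:
    requests:
      storage: 10Gi                     # illustrative size
  storageClassName: ontap-gold
```

Once the claim is created, Trident provisions a volume on a matching backend and Kubernetes binds the claim to the resulting PersistentVolume, as described in the remaining steps.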