Storage
3 types
3 types of storage: Block Storage, File Storage, Object Storage (blob=binary large object).
- Block Storage:
  - Evenly sized chunks.
  - No file systems.
- File Storage:
  - A hierarchy of files in folders.
  - With file systems.
- Blob / Object Storage:
  - Immutable and unstructured objects, e.g. images, audio or other multimedia objects;
  - Sometimes binary executable code is stored as a blob.
  - No hierarchy.
  - Access data and metadata using an HTTP API.
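To make the contrast between block and object storage concrete, here is a minimal in-memory sketch. The class names `BlockDevice` and `ObjectStore`, the 4-byte block size, and the `put`/`get`/`head` methods are all hypothetical and chosen for illustration; a real object store exposes these operations over HTTP, not as Python method calls. File storage is omitted because it is the familiar path-and-folder model.

```python
BLOCK_SIZE = 4  # bytes; unrealistically small so the example is easy to read


class BlockDevice:
    """Block storage: fixed-size blocks addressed by number, no file names."""

    def __init__(self, num_blocks, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def write_block(self, index, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.blocks[index] = bytes(data)

    def read_block(self, index):
        return self.blocks[index]


class ObjectStore:
    """Object storage: a flat key space mapping to (bytes, metadata)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        # An object is written or replaced as a whole, never edited in place.
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key):
        return self._objects[key][0]

    def head(self, key):
        # Return only the metadata, without the object body.
        return dict(self._objects[key][1])


dev = BlockDevice(num_blocks=8)
dev.write_block(2, b"abcd")

store = ObjectStore()
# The "/" in the key is just a character; there is no real folder hierarchy.
store.put("photos/2024/cat.jpg", b"\xff\xd8\xff", {"content-type": "image/jpeg"})
```

The block device knows nothing about names or types, while the object store keeps metadata next to each object and treats the key as an opaque string.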
Standards:
- block/file storage: POSIX.
- object storage: the AWS S3 API is the de facto standard. Most object storage services (e.g. Google Cloud Storage), products (e.g. NetApp StorageGRID), and open source projects (e.g. MinIO) are compatible with it.
Example of storage systems
- Online Transaction Processing Databases (OLTP)
  - Facebook Graph, mission critical, strong consistency, core services
- Semi-online Light Transaction Processing Databases (SLTP)
  - Facebook Messages and Facebook Time Series
- Immutable DataStore
  - Photos, videos, etc.
- Analytics DataStore
  - Data Warehouse, Logs storage
Facebook example. This is adapted from this slide
Service | Technology | Bottlenecks | Latency | Consistency | Durability |
---|---|---|---|---|---|
Facebook Graph | MySQL/TAO | Random read IOPS | few ms | quickly consistent across data centers | no data loss |
Messages and Time Series | HBase and HDFS | Write IOPS/storage capacity | < 200 ms | consistent within a data center | no data loss |
Photos / Videos | Haystack | storage capacity | < 250 ms | immutable | no data loss |
Data Warehouse | Hive / Presto / HDFS | storage capacity | < 1min | not consistent across data centers | no silent data loss |
The core of a distributed storage system
- sharding strategy
- metadata storage
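The two pieces above can be sketched together in a few lines. This is an illustrative toy, not how any particular system does it: the `ShardedStore` class, the shard names, and the choice of hash-modulo placement are all assumptions made for the example.

```python
import hashlib


class ShardedStore:
    """Toy store: a sharding strategy plus a metadata map of key -> location."""

    def __init__(self, shard_names):
        self.shard_names = list(shard_names)
        self.shards = {name: {} for name in self.shard_names}
        self.metadata = {}  # key -> {"shard": name, "size": bytes stored}

    def _pick_shard(self, key):
        # Sharding strategy: stable hash of the key, modulo the shard count.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[index]

    def put(self, key, value):
        shard = self._pick_shard(key)
        self.shards[shard][key] = value
        # Metadata storage: record where the data lives and how big it is.
        self.metadata[key] = {"shard": shard, "size": len(value)}

    def get(self, key):
        # Reads consult the metadata first, then go to the owning shard.
        shard = self.metadata[key]["shard"]
        return self.shards[shard][key]


cluster = ShardedStore(["shard-a", "shard-b", "shard-c"])
cluster.put("user:42", b"alice")
cluster.put("user:43", b"bob")
```

Hash-modulo placement is the simplest option, but changing the number of shards remaps most keys; consistent hashing or range-based sharding are common alternatives when shards are added or removed often.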
Distributed File Systems
Distributed file systems: GFS, Colossus, Alluxio, CephFS, HDFS
- Cluster level, fault tolerant, distributed file systems:
  - append only
  - not for structured data (use a database instead)
  - not optimized for small files
  - scoped to a cluster, not a data center; data is destroyed when the cluster is turned down
- HDFS is an open source file system modeled on GFS (Google File System)
- Colossus is the successor of GFS
- Spanner uses Colossus to store its tablets
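As a rough illustration of the append-only, fault-tolerant model described above, the sketch below splits an appended byte stream into fixed-size chunks and copies each chunk to several servers. Everything here is an assumption for teaching purposes: the `AppendOnlyFile` class, the 8-byte chunk size, and the round-robin replica placement are not taken from GFS, Colossus, or HDFS, which use far larger chunks and smarter placement.

```python
class AppendOnlyFile:
    """Toy append-only file: fixed-size chunks, each replicated on several servers."""

    def __init__(self, servers, chunk_size=8, replicas=3):
        if replicas > len(servers):
            raise ValueError("need at least as many servers as replicas")
        self.servers = list(servers)
        self.chunk_size = chunk_size
        self.replicas = replicas
        self.chunk_locations = []  # metadata: chunk index -> list of server names
        self.server_data = {s: {} for s in self.servers}  # server -> {chunk index: bytearray}

    def _allocate_chunk(self):
        index = len(self.chunk_locations)
        start = index % len(self.servers)  # hypothetical round-robin placement
        holders = [self.servers[(start + i) % len(self.servers)]
                   for i in range(self.replicas)]
        for server in holders:
            self.server_data[server][index] = bytearray()
        self.chunk_locations.append(holders)

    def _last_chunk_len(self):
        index = len(self.chunk_locations) - 1
        first_holder = self.chunk_locations[index][0]
        return len(self.server_data[first_holder][index])

    def append(self, payload):
        # The only write operation: data is added at the end, never rewritten.
        payload = bytes(payload)
        while payload:
            if not self.chunk_locations or self._last_chunk_len() == self.chunk_size:
                self._allocate_chunk()
            index = len(self.chunk_locations) - 1
            room = self.chunk_size - self._last_chunk_len()
            piece, payload = payload[:room], payload[room:]
            for server in self.chunk_locations[index]:
                self.server_data[server][index].extend(piece)

    def read(self, failed=()):
        # Read each chunk from the first replica that is still alive.
        out = bytearray()
        for index, holders in enumerate(self.chunk_locations):
            alive = [s for s in holders if s not in failed]
            if not alive:
                raise IOError(f"chunk {index} has no live replica")
            out.extend(self.server_data[alive[0]][index])
        return bytes(out)


log = AppendOnlyFile(["s1", "s2", "s3", "s4"])
log.append(b"hello ")
log.append(b"distributed world")
```

Because every chunk carries per-chunk metadata and replica overhead, a workload of many tiny files pays that cost for very little data, which is one way to see why such systems are not optimized for small files.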
Software Defined Storage
- Ceph: by Red Hat; an object store at its core, but supports all 3 types (object, block, file). For shorter-term storage and more frequent user access.
  - CephFS: a POSIX-compliant network file system
- Gluster: scalable file storage with object capabilities; also by Red Hat. Should not be used for something transactional, like a database or something that depends on really strict locking.
- Alluxio: a virtual distributed storage system
- MinIO: S3 compatible object storage, k8s native.
- Rook: a storage orchestrator for Kubernetes (can be used with Ceph as the storage provider; Rook = Ceph Operator + Discovery)
- VMware vSAN: creates shared storage for VMs.
- HDFS: part of Hadoop.
- NetApp
Thin provisioning
Thin provisioning, also known as virtual provisioning or thin storage, is a method of on-demand storage allocation based on user requirements in storage area networks (SAN), centralized storage disks, and storage virtualization systems.
Thin provisioning allows space to be easily allocated to servers, on a just-enough and just-in-time basis.
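The idea can be sketched as a volume that advertises a large virtual size but only takes physical blocks from a shared pool on first write. The `StoragePool` and `ThinVolume` classes and the block counts below are hypothetical, chosen only to show the allocate-on-write behaviour.

```python
class StoragePool:
    """Shared physical capacity, counted in blocks."""

    def __init__(self, physical_blocks):
        self.physical_blocks = physical_blocks
        self.used_blocks = 0

    def allocate_block(self):
        if self.used_blocks >= self.physical_blocks:
            raise RuntimeError("pool is out of physical space")
        self.used_blocks += 1


class ThinVolume:
    """Volume with a large virtual size; physical blocks are taken on first write."""

    def __init__(self, pool, virtual_blocks):
        self.pool = pool
        self.virtual_blocks = virtual_blocks
        self._blocks = {}  # logical block number -> data, written blocks only

    def _check(self, block_no):
        if not 0 <= block_no < self.virtual_blocks:
            raise IndexError("block number outside the virtual size")

    def write(self, block_no, data):
        self._check(block_no)
        if block_no not in self._blocks:
            # Just-in-time allocation: only a first write consumes pool space.
            self.pool.allocate_block()
        self._blocks[block_no] = data

    def read(self, block_no):
        self._check(block_no)
        # Unwritten blocks have no physical backing and read back as empty.
        return self._blocks.get(block_no, b"")


pool = StoragePool(physical_blocks=10)
vol_a = ThinVolume(pool, virtual_blocks=100)  # 200 virtual blocks promised in total,
vol_b = ThinVolume(pool, virtual_blocks=100)  # backed by only 10 physical ones
vol_a.write(0, b"x")
vol_a.write(50, b"y")
vol_b.write(7, b"z")
vol_a.write(0, b"x2")  # overwrite: no new physical block is taken
```

The trade-off is that the pool can be oversubscribed: if the volumes together write more blocks than physically exist, allocation fails, so real deployments monitor pool usage closely.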