Distributed Systems - Storage

Updated: 2019-01-27

Storage

Three types of storage: block storage, file storage, and object storage (blob = binary large object).

The unit of storage for each type:

  • block storage: evenly sized chunks.
  • file storage: a hierarchy of files in folders
  • blob/object storage: immutable objects

Blobs are typically images, audio, or other multimedia objects, though sometimes binary executable code is stored as a blob.
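
To make the three units concrete, here is a minimal in-memory sketch of how each type is addressed: a block number, a path through folders, or a flat key. The class names, method names, and the 4096-byte block size are all assumptions made for illustration; none of this is a real storage API.

```python
# Toy in-memory models of the three storage units (illustrative only).

BLOCK_SIZE = 4096  # assumed block size in bytes


class BlockDevice:
    """Evenly sized blocks addressed by block number."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block device only accepts whole blocks")
        self.blocks[index] = data

    def read_block(self, index):
        return self.blocks[index]


class FileStore:
    """Files addressed by a path through a folder hierarchy."""

    def __init__(self):
        self.root = {}  # nested dicts play the role of folders

    def write_file(self, path, data):
        *folders, name = path.strip("/").split("/")
        node = self.root
        for folder in folders:
            node = node.setdefault(folder, {})  # create folders on the way down
        node[name] = data

    def read_file(self, path):
        node = self.root
        for part in path.strip("/").split("/"):
            node = node[part]
        return node


class ObjectStore:
    """Whole objects addressed by a flat key: no folders, no partial writes."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = bytes(data)  # stores or replaces the whole object

    def get(self, key):
        return self.objects[key]


dev = BlockDevice(num_blocks=8)
dev.write_block(3, b"x" * BLOCK_SIZE)

files = FileStore()
files.write_file("/photos/2019/cat.jpg", b"jpeg bytes")

objects = ObjectStore()
objects.put("photos/2019/cat.jpg", b"jpeg bytes")  # the slashes are just part of the key
```

Note that the object key may look like a path, but in this sketch it is a single flat string; only the file store actually builds a folder hierarchy.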

Amazon's Offerings

  • S3: object storage
  • EBS: block storage
  • EFS: file storage

EBS and S3

  • EBS: high latency (compared with databases)
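  • EBS is mountable storage: it attaches to an EC2 instance as a device over the network (network-attached, though it exposes blocks rather than files as a typical NAS does)
  • S3 is not mountable: it is a storage service accessed through an API, not a device
  • EBS can be thought of as an external hard drive, while S3 is more akin to Dropbox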
  • Glacier is S3 for archived files.
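
The device-versus-service difference shows up in how data is updated. The sketch below is only an analogy: a temporary local directory stands in for a mounted volume, and a plain dict stands in for a bucket (both are hypothetical stand-ins, not AWS calls). On the mounted file, bytes can be overwritten in place; on the object side, the whole object is fetched and replaced, in line with objects being immutable.

```python
import os
import tempfile

# Stand-in for a file on a mounted volume: an ordinary local file.
with tempfile.TemporaryDirectory() as mount_point:  # hypothetical mount point
    path = os.path.join(mount_point, "data.bin")
    with open(path, "wb") as f:
        f.write(b"hello world")

    # A mounted device supports seeking and overwriting bytes in place.
    with open(path, "r+b") as f:
        f.seek(6)
        f.write(b"WORLD")

    with open(path, "rb") as f:
        volume_result = f.read()

# Stand-in for an object storage service: a dict keyed by object name.
bucket = {}  # hypothetical in-memory "bucket", not a real S3 client
bucket["data.bin"] = b"hello world"

# There is no seek: fetch the whole object, change a copy, store a whole replacement.
old = bucket["data.bin"]
bucket["data.bin"] = old[:6] + b"WORLD"
service_result = bucket["data.bin"]
```

Both paths end with the same bytes; the difference is that the second one had to move the entire object to change five bytes of it.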

Google's Offerings

  • Google Cloud Storage: object storage
  • Google Cloud Filestore: file storage
  • Persistent Disk: block storage

Red Hat's Offerings

https://www.redhat.com/en/topics/data-storage

  • Gluster: file storage
  • Ceph: object/block/file storage

    • CephFS: a POSIX-compliant network file system

Distributed File Systems

  • Cluster level, fault tolerant, distributed file systems:

    • append-only
    • not for structured data (use a database instead)
    • not optimized for small files
    • cluster level, not data center level: data is destroyed when the cluster is turned down
  • HDFS is the open-source counterpart of GFS (Google File System)
  • Colossus is the successor of GFS
  • Spanner uses Colossus to store its tablets
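
The "append-only" and "not optimized for small files" points can be illustrated with a toy model. This is a sketch under stated assumptions, not how GFS or HDFS is implemented: the names AppendOnlyFile and metadata_entries are made up, the chunk size is tiny on purpose (GFS used 64 MB chunks), and the bookkeeping cost is assumed to be one metadata entry per file plus one per chunk.

```python
CHUNK_SIZE = 64  # bytes; deliberately tiny for the example


class AppendOnlyFile:
    """A file stored as fixed-size chunks that can only grow at the end."""

    def __init__(self):
        self.chunks = [bytearray()]

    def append(self, data):
        while data:
            last = self.chunks[-1]
            room = CHUNK_SIZE - len(last)
            if room == 0:
                self.chunks.append(bytearray())  # last chunk is full, start a new one
                continue
            last.extend(data[:room])
            data = data[room:]

    def read(self):
        return b"".join(self.chunks)


def metadata_entries(file_sizes):
    """Count entries, assuming one per file plus one per chunk."""
    total = 0
    for size in file_sizes:
        chunks = max(1, -(-size // CHUNK_SIZE))  # ceiling division, at least one chunk
        total += 1 + chunks
    return total


f = AppendOnlyFile()
f.append(b"a" * 100)
f.append(b"b" * 50)

one_large_file = metadata_entries([6400])        # 6400 bytes in a single file
many_small_files = metadata_entries([1] * 6400)  # the same 6400 bytes, one byte per file
```

Under these assumptions the single large file needs 101 entries while the small files need 12800 for the same amount of data, which is the intuition behind the small-files caveat.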

Open-source Projects

NAS vs. DAS

  • DAS: Direct-Attached Storage.
  • NAS: Network-Attached Storage.