Databases - Spanner and Related Databases

Updated: 2022-03-07

Spanner

Spanner is a globally scalable database system used internally by Google; it is the successor to the Bigtable database. (Link to the paper)

Cloud Spanner is the managed database on Google Cloud Platform.

Like Bigtable, Spanner stores data in SSTables; however, it has started migrating to a columnar format.

Spanner does not have an auto-increment key. Do not use monotonically increasing values, including timestamps, as keys: Spanner is distributed and sharded by key range, so such keys send every new write to the same split, which creates hotspots and hurts performance.
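
To illustrate the point about keys, here is a minimal sketch of three common ways to avoid monotonically increasing keys: a random UUID, a hash-derived shard prefix, and a bit-reversed sequence number. The function names and the default shard count are hypothetical and chosen for illustration only; none of this is a Spanner API.

```python
import hashlib
import uuid
from typing import Tuple


def uuid_key() -> str:
    """Random UUIDv4 key: values are spread uniformly over the key space."""
    return str(uuid.uuid4())


def sharded_key(sequential_id: int, num_shards: int = 16) -> Tuple[int, int]:
    """Prefix a sequential id with a hash-derived shard number.

    Consecutive ids usually land in different shards, so writes are spread
    over num_shards key ranges instead of piling up at the end of one range.
    """
    digest = hashlib.sha256(str(sequential_id).encode()).digest()
    shard = int.from_bytes(digest[:8], "big") % num_shards
    return (shard, sequential_id)


def bit_reversed_key(sequential_id: int, bits: int = 64) -> int:
    """Reverse the bits of a non-negative id so neighbours end up far apart."""
    return int(format(sequential_id, f"0{bits}b")[::-1], 2)
```

The shard prefix keeps the original id readable but requires reads to know (or recompute) the shard; bit reversal is invertible, so the original sequence number can always be recovered.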

Spanner natively supports Protocol Buffers (ProtoBuf).

Cloud Spanner replicates each split using Paxos.

Data model

  • A Spanner database has one or more tables.
  • Tables look like tables in other relational databases: rows, columns, and values. Data is strongly typed.
  • Each table has a primary key made up of one or more columns.
  • A table can have one or more secondary indexes.
  • Spanner supports table interleaving and foreign keys.
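
Table interleaving can be pictured as a key layout: a child table's primary key starts with the parent's primary key, so child rows sort (and are stored) right after their parent row. The sketch below only models that ordering with Python tuples; the Singers and Albums tables and their contents are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical tables: Singers(singer_id) is the parent,
# Albums(singer_id, album_id) is interleaved in it.
singers: Dict[Tuple[int], str] = {(1,): "Singer 1", (2,): "Singer 2"}
albums: Dict[Tuple[int, int], str] = {
    (1, 10): "Album 1-10",
    (2, 20): "Album 2-20",
    (1, 11): "Album 1-11",
}


def interleaved_layout() -> List[Tuple[Tuple[int, ...], str]]:
    """Return all rows in the order an interleaved layout would store them.

    Sorting on the key tuple places each parent row immediately before
    its child rows, because every child key starts with the parent key.
    """
    rows = list(singers.items()) + list(albums.items())
    return sorted(rows, key=lambda row: row[0])
```

Reading a singer together with its albums then touches one contiguous key range rather than two separate tables.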

When you commit writes to a Spanner database, the system versions each item of data and associates it with a specific commit timestamp. The next time you update that item, the old version can still be read (subject to garbage collection limits), and the new version is assigned a timestamp that is guaranteed to be greater than the timestamp of the old version. This allows Spanner clients to read the current values of the data ("strong reads") as well as older values (using a read at a timestamp or a bounded stale read) within a certain bound (e.g. the past ~4 hours).
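
The versioning behaviour can be sketched with a toy in-memory store. This is an illustration only, not how Spanner is implemented: the class name, the method names, and the integer logical clock that stands in for commit timestamps are all assumptions.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class VersionedStore:
    """Toy multi-version store: every write keeps the older versions."""

    def __init__(self) -> None:
        # key -> list of (commit_ts, value), kept sorted by commit_ts
        self._versions: Dict[str, List[Tuple[int, str]]] = {}
        self._last_commit_ts = 0

    def write(self, key: str, value: str) -> int:
        # Hypothetical logical clock standing in for a commit timestamp;
        # each commit gets a strictly greater timestamp than the last one.
        self._last_commit_ts += 1
        self._versions.setdefault(key, []).append((self._last_commit_ts, value))
        return self._last_commit_ts

    def read_at(self, key: str, ts: int) -> Optional[str]:
        """Timestamp read: newest version with commit_ts <= ts, if any."""
        versions = self._versions.get(key, [])
        timestamps = [commit_ts for commit_ts, _ in versions]
        idx = bisect.bisect_right(timestamps, ts)
        return versions[idx - 1][1] if idx else None

    def strong_read(self, key: str) -> Optional[str]:
        """Strong read: the latest committed value."""
        return self.read_at(key, self._last_commit_ts)

    def garbage_collect(self, oldest_ts: int) -> None:
        """Drop versions that no read at or after oldest_ts can see."""
        for key, versions in self._versions.items():
            timestamps = [commit_ts for commit_ts, _ in versions]
            keep_from = max(bisect.bisect_right(timestamps, oldest_ts) - 1, 0)
            self._versions[key] = versions[keep_from:]
```

A read at an old timestamp keeps returning the old value until garbage collection removes that version, which mirrors the retention bound described above.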

TrueTime

TrueTime is a highly available, distributed clock that is provided to applications on all Google servers. TrueTime enables applications to generate monotonically increasing timestamps.
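
As a rough sketch of how an uncertainty-aware clock can yield monotonically increasing timestamps, the Spanner paper describes TrueTime as returning an interval [earliest, latest] that contains the true time, and describes waiting until a chosen commit timestamp is definitely in the past ("commit wait"). The class below models that idea only; the class name, the fixed epsilon value, and the injectable clock and sleep functions are assumptions for illustration, not Google's API.

```python
import time
from typing import Callable, NamedTuple


class TTInterval(NamedTuple):
    earliest: float
    latest: float


class TrueTimeSketch:
    def __init__(
        self,
        epsilon: float = 0.005,
        clock: Callable[[], float] = time.time,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.epsilon = epsilon  # assumed clock uncertainty bound, in seconds
        self._clock = clock
        self._sleep = sleep

    def now(self) -> TTInterval:
        """Return an interval assumed to contain the true current time."""
        t = self._clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, ts: float) -> bool:
        """True once ts has definitely passed."""
        return self.now().earliest > ts

    def commit(self) -> float:
        """Pick a commit timestamp, then wait until it is definitely past."""
        commit_ts = self.now().latest
        while not self.after(commit_ts):
            self._sleep(self.epsilon)
        return commit_ts
```

Because `commit` returns only after its timestamp has definitely passed, any later call picks a strictly greater timestamp, provided the underlying clock never runs backwards. The price is a wait of roughly twice the uncertainty bound per commit.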

Achieving Eventual Consistency Performance

Spanner provides stale reads, which offer performance benefits similar to those of eventual consistency but with much stronger consistency guarantees. A stale read returns data as of an "old" timestamp; it cannot block writes, because old versions of data are immutable.

Proto

Spanner supports a PROTO<...> type, which allows for the storage of structured data using a user-defined protocol buffer type.

  • with BLOB: opaque protos that are not validated at write time; queries might return null for data that does not match the proto definition.
  • without BLOB: CREATE PROTO BUNDLE is required so that message contents are validated on writes; it also enables the use of fields of that proto type in Spanner SQL queries.

Index

In Cloud Spanner, indexes are actually implemented using tables, which allows them to be distributed and enables the same degree of scalability and performance as normal tables.

However, because of this implementation, using an index to read data from the table row is less efficient than in a traditional RDBMS: it is effectively an inner join with the original table.

In addition, every write to the table must also update the index. Using an index in Cloud Spanner is therefore always a trade-off between improved read performance and reduced write performance.
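
The trade-off can be made concrete with a toy model in which both the base table and the index are plain key-value tables. The schema (users with a name and a city) and the function names are hypothetical; a real index would use a range scan over the index keys rather than the linear scan shown here.

```python
from typing import Dict, List, Tuple

# Base table keyed by primary key: user_id -> (name, city).
users: Dict[int, Tuple[str, str]] = {}
# Index "table" keyed by (indexed column value, primary key).
users_by_city: Dict[Tuple[str, int], None] = {}


def upsert_user(user_id: int, name: str, city: str) -> None:
    """Write one row: the base table and the index table are both updated."""
    old = users.get(user_id)
    if old is not None:
        # Remove the stale index entry that points at the old city.
        users_by_city.pop((old[1], user_id), None)
    users[user_id] = (name, city)
    users_by_city[(city, user_id)] = None


def find_by_city(city: str) -> List[Tuple[int, str, str]]:
    """Read via the index, then join back to the base table for the row."""
    # Linear scan for brevity; stands in for a range scan on the index.
    matches = sorted(pk for (c, pk) in users_by_city if c == city)
    return [(pk, *users[pk]) for pk in matches]
```

Each write touches two tables, and each indexed read does an index lookup followed by a lookup in the base table, which is the join described above.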

Spanner Inspired Databases

  • CockroachDB
  • YugabyteDB

CockroachDB

https://www.cockroachlabs.com/

An open-source database inspired by Google Spanner. CockroachDB is a distributed database architected for modern cloud applications. It is wire-compatible with PostgreSQL.

CockroachDB is backed by RocksDB, an embedded key-value store, or by Pebble, a purpose-built key-value store inspired by RocksDB. RocksDB comes from Facebook but is based on LevelDB, which came from Google.

CockroachDB is implemented in Go.

CockroachDB deprecated interleaved tables and indexes in v20.2, saying that scans over tables or indexes with interleaved child objects are much slower than scans over tables and indexes with no child objects, and that database schema changes are slower for interleaved objects. https://www.cockroachlabs.com/docs/v21.1/interleave-in-parent#deprecation

Pebble is the key-value engine that replaced RocksDB as the default storage engine in CockroachDB. Like RocksDB, it is based on an LSM tree. Pebble does not aim to be a complete replacement for RocksDB, only a replacement for the RocksDB functionality that CockroachDB uses. https://www.cockroachlabs.com/blog/pebble-rocksdb-kv-store/ https://github.com/cockroachdb/pebble