Over the years, different kinds of databases were developed for different use cases. DB-Engines is tracking more than 300 databases. Choose wisely.
- relations: between rows and columns(a relation is a table, not between tables)
- a unique key identifying each row.
It is a safe bet to start With MySQL or PostgreSQL. Many mobile apps rely on SQLite.
MySQL is the top one open-source db on DB-Engines, also proved its scalability at Facebook, and Uber migrated from Postres to MySQL
MariaDB is a fork of MySQL and it is gaining popularity quickly.
Google and Amazon have their home-grown relational database. All 3 biggest clouds managed popular databases.
- Amazon Aurora: Amazon's own relational database
- Amazon RDS: Amazon managed MySQL/PostgreSQL/MariaDB
- Google Cloud Spanner: Google's own relational database
- Google Cloud SQL: Google managed MySQL/PostgreSQL
- Azure Database for MySQL/PostgreSQL/MariaDB: Microsoft managed db.
MongoDB is probably the most popular open source NoSQL database. DynamoDB is the one shining in the cloud.
There are many open source db under Apache, some of them are implementations of the famous papers published by companies like Google
- Amazon DynamoDB
- Google Cloud Bigtable
- Azure CosmosDB
- MongoDB Atlas: MongoDB's own cloud offering https://www.mongodb.com/cloud/atlas/
Relational databases can function as general purpose graph databases, but they are not very efficient to traverse the graphs. Multiple queries and joins are often needed.
E.g. Facebook created its own huge social graph, every entity is a node(like a person, a business, a page, a group), and the different types of relationships are the edges. It is backed by TAO, which is actually a caching layer over MySQL.
Graph databases you can use if you choose not to build it in-house like Facebook:
- Neo4j: a Java graph db.
- JanusGraph: started as a ford of TitanDB(now TitanDB is discontinued). Supported by Google.
- Amazon Neptune
- graph model: Property Graph and W3C's RDF,
- graph query: Apache TinkerPop Gremlin and SPARQL
- Giraph: based on Google's Pregel, however Pregel is deprecated.
Choose Redis for cache for all new projects. Though Memcache is used extensively at Facebook.
Redis: not often used as a primary data store, but for storing and accessing ephemeral data(loss can be tolerated) – metrics, session state, caching.
Some databases are optimized for flash, making them cheaper alternatives to caches. E.g. Aerospike
key-value stores, by design, have trouble linking values together (in other words, they have no foreign keys).
- Amazon ElastiCache
- Google Cloud Memorystore
- Azure Cache for Redis
Read more in the cache page.
Especially useful for:
- DevOps Monitoring
- IoT Applications
- Real-time analytics
- Amazon Timestream
- InfluxDB: https://www.influxdata.com/time-series-platform/influxdb/
Solr is losing popularity to Elasticsearch.
- Amazon CloudSearch
- Microsoft Azure Search
- Google Search Appliance
Read more from the Data Warehouse page.
Often used in web browsers and mobile apps.
- the whole database is in one single file on disk, can be embedded inside the application, very portable.
- zero-config, easy to setup (e.g. no
GRANT, use file permissions)
- not a client–server database, no network capabilities, cannot be used over a network.
- not for write-intensive deployments, not for high concurrency use case, since it relies on file-system locks, versus server-based databases handle locks in daemons
- no type checking, the type of a value is dynamic and not strictly constrained by the schema
- no user management; no way to tune for higher performance
- not that reliable, comparing to other RDBMS like MySQL
- row key: UUID, equivalent to “primary key"
- column key: string
- ref key: version number(the highest is the latest)
Uses vector lock to resolve write conflicts
Spanner is a globally-scalable database system used internally by Google; it is the successor of the BigTable database. (Link to the paper)
Cloud Spanner is the managed database on Google Cloud Platform.
As a successor of BigTable, Spanner also uses SSTable; however it starts to migrate to use a columnar format instead.
Spanner does not have auto-increment key; do not use numbers in incremental order as keys, including timestamps, because Spanner is distributed and sharded by key, such keys will result in hotspots and hurt performance.
Natively support ProtoBuf.
An open source version of Google Spanner, bigtable=>spanner(google)=>CockroachDB(out of google)
CockroachDB uses RocksDB, an embedded key-value store, internally. Though RocksDB is from Facebook, but it is based on LevelDB, which was also from Google.
Based on based on the Bigtable technology from Google
- modeled after google’s bigtable
- HBase features compression, in-memory operation, and Bloom filters on a per-column basis as outlined in the original Bigtable paper.
- In the parlance of Eric Brewer’s CAP Theorem, HBase is a CP type system.
- LevelDB(created by Google): inspired by Bigtable, also use SSTable.
- RocksDB(created by Facebook): a fork of LevelDB
- better compression
- less write-amplification
How it works:
- SSTables are arranged in several levels.
- SSTables are non-overlapping within a level, e.g. 2 levels:
- level 1: 2 SSTables, one with key space
[a, b), and another
- level 2: single SSTable with key space
- the query with a string starting with
awill look at
[a,c)on level 2 and
[a, b)on level 1. Since the strings are sorted, it takes
- level 1: 2 SSTables, one with key space
Unlike either monolithic or master-slave designs, Cassandra makes use of an entirely peer-to-peer architecture. All nodes in a Cassandra cluster can accept reads and writes, no matter where the data being written or requested actually belongs in the cluster.
- Shard data automatically
- Handle partial outages without data loss or downtime
- Scales close to linearly
- High write throughput
- Horizontal scalability
- Automatic Failover
- Strong consistency within a data center
Five-dimensional data model(similar to Bigtable or Cassandra):
- namespaces(databases): contain sets (tables) of records
- sets: table
- key: identify records
- metadata: generation tracks record modification, time-to-live (TTL) specifies record expiration
- bin: name value pairs.
- Data is sharded and balanced between servers using a Paxos-based membership algorithm.
- Aerospike makes a dangerous assumption for a distributed datastore: it assumes the network is reliable
Comparing to in memory cache:
- optimized for flash storage
- cheaper than the in memory cache
- no need to re-load data into memory after outrage
- append only: data is virtually incorruptible and easy to replicate, back up, and restore.
- not for ad hoc queries
- multiple masters
Data is continuously backed up to S3 in real time, with no performance impact.
Instead of creating an index tree for each column, Cosmos DB employs one index for the whole database account
- InnoDB: B-tree
- RocksDB: LSM(log-structured merge tree)
- We have found RocksDB, when compared to InnoDB, uses less storage space to store an UDB instance and generates less write traffic overall
- A B-Tree wastes space when pages fragment. An LSM doesn't fragment.
as cardinality moderately increases, InfluxDB performance drops dramatically due to its reliance on time-structured merge trees (which, similar to the log-structured merge trees it is modeled after, suffers with higher-cardinality datasets). This of course should be no surprise, as high cardinality is a well known Achilles heel for InfluxDB
- Extraction: Stream data to key-value maps or other simplified forms, so that serving systems or others can consume them.
- Transformation: Do additional computation and write the augmented data back to the same database.