logo

PostgreSQL

Last Updated: 2024-03-03
  • Postgres uses process instead of thread, connection pool is used to prevent frequent creating and closing processes.
  • Different languages/framework have different modules with different quality to manage the connection pool, which often causes problems. Use middleware.
  • Postgres can easily handle 10k insertions per second, which can replace a dedicated pub/sub server like Kafka.

Caching

https://madusudanan.com/blog/understanding-postgres-caching-in-depth/

Understanding caching in Postgres - An in-depth guide: Postgres is a process based system, i.e each connection has a native OS process...Postgres as a cross platform database, relies heavily on the operating system for its caching...WAL is a redo log that basically keeps track of whatever that is happening to the system...It is always better to monitor something directly from postgres,rather than going through the OS route.

PostgreSQL vs MySQL

  • Postgres limitations:
    • Inefficient architecture for writes;
    • Inefficient data replication;
    • Issues with table corruption;
    • Poor replica MVCC support;
    • Difficulty upgrading to newer releases.
  • MySQL wins:
    • The most important architectural difference is that while Postgres directly maps index records to on-disk locations, InnoDB maintains a secondary structure. Instead of holding a pointer to the on-disk row location (like the ctid does in Postgres);
    • MySQL supports multiple different replication modes; InnoDB storage engine implements its own LRU in something it calls the InnoDB buffer pool; MySQL implements concurrent connections by spawning a thread-per-connection. This is relatively low overhead.

PostgreSQL derived or compatible databases

Many databases are building on top of postgres:

  • PostGIS: for geo spatial data
  • TimescaleDB: for time-series data
  • Greenplum (from VMware): for analytics
  • Netezza (from IBM): data warehouse
  • AgensGraph: graph database
  • Redshift (AWS): data warehouse

Postgres Compatible (using Postgres' API / SQL):

  • Google Cloud Spanner
  • CockroachDB

TimescaleDB

  • https://www.timescale.com/
  • TimescaleDB is packaged as a PostgreSQL extension
  • TimescaleDB is more than 20x faster than vanilla PostgreSQL when inserting data at scale
  • TimescaleDB enables you to scale to 1 billion rows with no real impact to insert performance

https://blog.timescale.com/scalable-postgresql-high-availability-read-scalability-streaming-replication-fb95023e2af

Used by Other Projects

E.g. harbor, gitlab.