Distributed Logging

Updated: 2020-03-07

Why Logging

  • Debug logging: instead of keeping info, warn, and error logs only on the local machine, send them to a centralized place so that stack traces can be viewed on a dedicated web page, regardless of where the code is running, whether on your dev server or on some random node in a staging/prod cluster.
  • System metrics: things like QPS, latency, availability, and request counts. These tell you the health of the system (cluster) and are largely unrelated to business logic.
  • Business metrics: especially for billing purposes.
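
As a rough illustration of the debug-logging case, the sketch below uses Python's standard `logging` module to forward every record, including the stack trace, to a central collector. The `CentralLogHandler` class, the `send` callable, and the `checkout` service name are hypothetical names made up for this example; in a real setup `send` would be something like an HTTP POST or a message-queue publish, and here it simply appends to an in-memory list so the code stays runnable.

```python
import json
import logging
import socket
import traceback


class CentralLogHandler(logging.Handler):
    """Forwards each log record as a JSON document to a central collector.

    `send` is a hypothetical transport callable (assumption for this sketch);
    it receives one JSON string per log record.
    """

    def __init__(self, send, service):
        super().__init__()
        self.send = send
        self.service = service
        self.host = socket.gethostname()

    def emit(self, record):
        payload = {
            "service": self.service,
            "host": self.host,  # which node produced the record
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "timestamp": record.created,
        }
        if record.exc_info:
            # Ship the full stack trace so it can be viewed centrally.
            payload["stacktrace"] = "".join(
                traceback.format_exception(*record.exc_info)
            )
        try:
            self.send(json.dumps(payload))
        except Exception:
            # Never let a logging failure crash the application.
            self.handleError(record)


collected = []  # stands in for the central log store

logger = logging.getLogger("checkout")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example output quiet
logger.addHandler(CentralLogHandler(collected.append, service="checkout"))

logger.info("order accepted")
try:
    1 / 0
except ZeroDivisionError:
    logger.exception("failed to charge card")
```

Attaching the host name and service name to every record is what makes the logs searchable from one place no matter which node emitted them.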

Distributed Log

  • High write availability and durable record storage.
  • Repeatable total order on those records: every reader sees the records in the same order, every time.
  • Append-only: existing records cannot be modified.
  • Relatively long-lived: retention can be days or months. It also depends on the privacy policy: PII data may need to be deleted once the retention period ends, while anonymized data may live longer.
  • Record-oriented: data is written to the log as indivisible records rather than as individual bytes.
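
The properties above can be sketched with a small in-memory model. This is only an illustration under strong assumptions: it runs in a single process, so it does not model availability or durability at all, and the names `Record`, `AppendOnlyLog`, `read_from`, and `trim` are invented for this example rather than taken from any real product. It shows sequence numbers giving a repeatable total order, immutable whole records, and retention-based trimming.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a record cannot be modified once written
class Record:
    seq: int          # position in the total order
    timestamp: float  # append time, used for retention
    payload: bytes    # opaque, indivisible record body


class AppendOnlyLog:
    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._records = []
        self._next_seq = 0

    def append(self, payload, now=None):
        """Append one whole record and return its sequence number."""
        now = time.time() if now is None else now
        record = Record(self._next_seq, now, bytes(payload))
        self._records.append(record)
        self._next_seq += 1
        return record.seq

    def read_from(self, seq):
        """Return every retained record with sequence number >= seq, in order."""
        return [r for r in self._records if r.seq >= seq]

    def trim(self, now=None):
        """Drop records older than the retention period; return how many."""
        now = time.time() if now is None else now
        cutoff = now - self.retention_seconds
        kept = [r for r in self._records if r.timestamp >= cutoff]
        dropped = len(self._records) - len(kept)
        self._records = kept
        return dropped


# Explicit `now` values keep the example deterministic.
log = AppendOnlyLog(retention_seconds=60)
log.append(b"a", now=0)
log.append(b"b", now=30)
log.append(b"c", now=90)

tail = log.read_from(1)      # records 1 and 2, same order on every read
dropped = log.trim(now=100)  # cutoff is 40, so records 0 and 1 expire
```

Sequence numbers are never reused after trimming, so a reader that remembers the last sequence number it saw can resume from that point without ambiguity.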

Products and Solutions

  • Commercial Solutions: Splunk, Sumo Logic
  • Open Source Solutions: Kibana (the visualization front end of the Elastic Stack, usually paired with Elasticsearch and Logstash)
  • Facebook LogDevice
  • Google Cloud Logging