Tech Stacks - Logging

Why Logging

Debug logging: instead of keeping INFO, WARN, and ERROR logs on the local machine, send them to a centralized store so that stack traces can be viewed on a dedicated web page, regardless of where the code runs, whether on your dev server or on an arbitrary node in a staging/prod cluster.
System metrics: QPS, latency, availability, request counts, etc. These indicate the health of the system (cluster) and are largely unrelated to business logic.
Business metrics: especially for billing purposes.
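
As a rough illustration of the debug-logging point above, here is a minimal sketch using Python's standard logging module. The CentralSinkHandler class and its send callable are hypothetical names, not part of any product listed in these notes; in a real setup send would ship the record over the network to the central store, while here it just appends to a list.

```python
import json
import logging
import socket
import traceback


class CentralSinkHandler(logging.Handler):
    """Hypothetical handler: serializes each record and hands it to `send`."""

    def __init__(self, send):
        super().__init__()
        self.send = send  # callable that ships one JSON string to the central store

    def emit(self, record):
        payload = {
            "host": socket.gethostname(),  # which node produced the record
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
            "time": record.created,
        }
        if record.exc_info:
            # Keep the full stack trace so it can be viewed centrally.
            payload["stacktrace"] = "".join(
                traceback.format_exception(*record.exc_info)
            )
        self.send(json.dumps(payload))


received = []  # stand-in for the central store (assumption for this sketch)
log = logging.getLogger("checkout")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(CentralSinkHandler(received.append))

try:
    1 / 0
except ZeroDivisionError:
    log.exception("charge failed")  # logged at ERROR level with the stack trace
```

Attaching the host name to every record is what lets the same web page show logs from a dev server and from a cluster node side by side.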

Distributed / Centralized Logging

High write availability and durable record storage.
Repeatable total order on the records.
Append-only: existing records cannot be modified.
Relatively long-lived: retention can be days or months. It also depends on the privacy policy: PII may need to be deleted after the retention period, while anonymized data may live longer.
Record-oriented: data is written into the log in indivisible records, rather than individual bytes.
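
The properties above can be sketched with a small in-memory model. This is only an illustration of the append-only, record-oriented, totally ordered interface; the AppendOnlyLog and Record names are hypothetical, and durability, replication, and retention are deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)  # frozen: a stored record cannot be modified
class Record:
    seq: int        # position in the total order
    payload: bytes  # opaque, written and read as one indivisible unit


class AppendOnlyLog:
    """Hypothetical in-memory log: the only write operation is append."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def append(self, payload: bytes) -> int:
        """Store one whole record and return its sequence number."""
        seq = len(self._records)
        self._records.append(Record(seq, bytes(payload)))
        return seq

    def read_from(self, seq: int) -> List[Record]:
        """Return all records with sequence number >= seq, in order."""
        if seq < 0:
            raise ValueError("seq must be non-negative")
        return list(self._records[seq:])
```

Because readers address records by sequence number, two readers starting from the same position see the same records in the same order, which is the "repeatable total order" property.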

Products and Solutions

Commercial Solutions: Splunk, Sumo Logic
Open Source Solutions: Kibana (the visualization UI of the Elastic Stack)
Facebook LogDevice
Google Cloud Logging
AWS Central Logging
OpenTelemetry

Different Aspects

machine logs vs user logs
real-time vs historical
collected logs vs processed logs (sessionization, normalization, anonymization)
debug logs (INFO/WARNING/ERROR) vs event logs (revenue logs, click logs)
log lifecycle: retention, wipeout, takeout
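
To make the "collected vs processed" distinction concrete, here is a minimal sessionization sketch: raw click events are grouped into per-user sessions. The sessionize function, the (user_id, timestamp) event shape, and the 30-minute inactivity gap are all assumptions chosen for illustration, not something these notes prescribe.

```python
from typing import Dict, Iterable, List, Tuple


def sessionize(
    events: Iterable[Tuple[str, int]], gap_seconds: int = 1800
) -> Dict[str, List[List[int]]]:
    """Group (user_id, timestamp_seconds) events into per-user sessions.

    A new session starts whenever a user has been inactive for more than
    gap_seconds (assumed default: 30 minutes).
    """
    sessions: Dict[str, List[List[int]]] = {}
    for user, ts in sorted(events):  # sort by user, then by time
        user_sessions = sessions.setdefault(user, [])
        if user_sessions and ts - user_sessions[-1][-1] <= gap_seconds:
            user_sessions[-1].append(ts)  # continue the current session
        else:
            user_sessions.append([ts])    # start a new session
    return sessions


# Collected (raw) events, possibly out of order.
raw_events = [("u1", 100), ("u2", 50), ("u1", 0), ("u1", 5000)]
processed = sessionize(raw_events)
```

The raw events are what the collectors wrote; the sessions are a derived, processed view that can be regenerated if the rule (here, the gap) changes.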

Versioned Processed Logs

Versioned directories (e.g. /prefix/YYYY/MM/DD/<version>/) should be used.

For Log Consumer

With non-versioned directories: suppose an analysis job is reading log files when the data processing pipeline finishes and updates the old data in place. The analysis job ends up reading a mix of old and new files, with duplicated or missing data.

With versioned directories: the analysis job can keep reading the old files until it finishes and get a consistent result; if it runs again, it can pick up the new data.
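
A minimal sketch of the consumer side, under a few assumptions that are not in these notes: versions are integer-named subdirectories, a version counts as complete once it contains a marker file (here called _READY, a hypothetical name), and log files end in .log. The key point is that the consumer resolves the version once and reads only from that directory.

```python
import tempfile
from pathlib import Path
from typing import List, Optional

READY_MARKER = "_READY"  # hypothetical marker file name


def latest_ready_version(day_dir: Path) -> Optional[Path]:
    """Return the highest-numbered version directory that is marked ready."""
    ready = [
        d for d in day_dir.iterdir()
        if d.is_dir() and d.name.isdigit() and (d / READY_MARKER).exists()
    ]
    return max(ready, key=lambda d: int(d.name), default=None)


def read_snapshot(day_dir: Path) -> List[str]:
    """Read all log lines from one pinned version, ignoring newer ones."""
    version_dir = latest_ready_version(day_dir)  # resolved once, then pinned
    if version_dir is None:
        return []
    lines: List[str] = []
    for log_file in sorted(version_dir.glob("*.log")):
        lines.extend(log_file.read_text().splitlines())
    return lines


# Demo in a temporary directory: version 2 exists but is not ready yet.
day_dir = Path(tempfile.mkdtemp()) / "2024" / "01" / "15"
(day_dir / "1").mkdir(parents=True)
(day_dir / "1" / "part-0.log").write_text("old\n")
(day_dir / "1" / READY_MARKER).touch()
(day_dir / "2").mkdir()
(day_dir / "2" / "part-0.log").write_text("new\n")
first_run = read_snapshot(day_dir)   # reads version 1 only

(day_dir / "2" / READY_MARKER).touch()
second_run = read_snapshot(day_dir)  # a later run picks up version 2
```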

For Log Producer

With non-versioned directories: the new logs must be generated in place, or generated in another directory and then copied over.

With versioned directories: the new logs are generated in a new version directory, which is then marked as ready (live) and made visible to consumers. Rolling back is also easier.
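
A minimal sketch of the producer side, assuming a marker-file convention that these notes do not specify: the producer writes all files into a fresh version directory and creates a marker file (hypothetically named _READY) as the very last step. Rolling back then only means removing the marker of the bad version. The function names and the part-0.log file name are illustrative.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

READY_MARKER = "_READY"  # hypothetical marker file name


def publish_version(day_dir: Path, version: int, lines: Iterable[str]) -> Path:
    """Write a new version and mark it ready only after all data is written."""
    version_dir = day_dir / str(version)
    version_dir.mkdir(parents=True, exist_ok=False)  # never overwrite a version
    (version_dir / "part-0.log").write_text("".join(l + "\n" for l in lines))
    (version_dir / READY_MARKER).touch()  # last step: make the version visible
    return version_dir


def roll_back(day_dir: Path, version: int) -> None:
    """Hide a bad version from consumers; its data stays for inspection."""
    (day_dir / str(version) / READY_MARKER).unlink()


def live_versions(day_dir: Path) -> List[int]:
    """List the versions currently visible to consumers."""
    return sorted(
        int(d.name) for d in day_dir.iterdir()
        if d.is_dir() and d.name.isdigit() and (d / READY_MARKER).exists()
    )


# Demo in a temporary directory: publish two versions, then roll back the second.
day_dir = Path(tempfile.mkdtemp()) / "2024" / "01" / "15"
publish_version(day_dir, 1, ["a"])
publish_version(day_dir, 2, ["a", "b"])
roll_back(day_dir, 2)
```

Since older versions are never touched, a consumer that resolves the latest ready version falls back to version 1 after the rollback without any data being copied or regenerated.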

Different logs

remote debug logs (server logs): info, error, fatal
query log
production change logs: rollouts, command-line parameter changes
trace
binary crash logging