Software Engineer vs Site Reliability Engineer

Updated: 2021-12-07

(Note: different companies define these roles differently. This doc describes just one typical case. SRE may also be called DevOps or Production Engineer.)

TL;DR

  • Software Engineer (SWE): owns the design and implementation of the system; hands over the compiled binaries to SRE to run in production.
  • Site Reliability Engineer (SRE): owns the binaries running on servers in production; treats binaries as black boxes.
  • Responsibilities overlap, so SWE and SRE need to work closely together.

The development workflow

Some improvements to production systems happen within the SRE org; in that case SREs act just like SWEs (they own both the design and the implementation). Here we discuss a business-related project.

SLOs, Metrics, Monitoring

SWEs and SREs need to collaboratively define the SLOs and the metrics, e.g. latency.
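
As an illustration of what such a jointly defined metric could look like, here is a minimal sketch of a latency SLI checked against an SLO target. The function names, the 300 ms threshold, the 90% target and the sample data are all hypothetical, chosen only for this example; a real team would agree on its own values and read the measurements from its monitoring system.

```python
def latency_sli(latencies_ms, threshold_ms):
    """Return the fraction of requests served within threshold_ms."""
    if not latencies_ms:
        raise ValueError("no requests observed")
    good = sum(1 for latency in latencies_ms if latency <= threshold_ms)
    return good / len(latencies_ms)


def meets_slo(latencies_ms, threshold_ms, target):
    """Return True if the measured SLI reaches the agreed SLO target."""
    return latency_sli(latencies_ms, threshold_ms) >= target


# Hypothetical request latencies (in milliseconds) from one measurement window.
sample = [120, 85, 240, 310, 95, 150, 180, 90, 110, 600]

# Assumed SLO for this sketch: 90% of requests finish within 300 ms.
sli = latency_sli(sample, threshold_ms=300)
print(f"SLI: {sli:.0%}")
print("SLO met:", meets_slo(sample, threshold_ms=300, target=0.90))
```

In this sample 8 of 10 requests finish within 300 ms, so the SLI is 80% and the assumed 90% SLO is missed. The point of the sketch is only that the threshold and the target are explicit numbers both sides have to agree on.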

Both SWEs and SREs should be familiar with the monitoring tools.

Design Docs

SWE owns the design. However, the design doc often needs to be reviewed and approved by an SRE to make sure the production systems can handle the change (e.g. whether SLOs can still be met, whether extra capacity is required).
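
The capacity question raised in such a review can be sketched as simple arithmetic. The snippet below is a rough illustration only: the function name, the traffic numbers, the per-server throughput and the 70% target utilization are assumptions made up for this example, not figures from any real system.

```python
def servers_needed(peak_qps, qps_per_server, target_utilization_pct=70):
    """Return how many servers are needed to serve peak_qps while keeping
    each server at or below target_utilization_pct of its capacity."""
    if peak_qps < 0 or qps_per_server <= 0:
        raise ValueError("traffic must be >= 0 and server capacity > 0")
    if not 0 < target_utilization_pct <= 100:
        raise ValueError("target utilization must be in (0, 100]")
    # Usable capacity per server, scaled by 100 to stay in integer math.
    usable = qps_per_server * target_utilization_pct
    # Integer ceiling division: round up to a whole number of servers.
    return (peak_qps * 100 + usable - 1) // usable


# Hypothetical numbers for the review.
current_peak_qps = 7000   # peak traffic today
added_peak_qps = 1400     # extra peak traffic expected from the new change
qps_per_server = 500      # measured throughput of one server

before = servers_needed(current_peak_qps, qps_per_server)
after = servers_needed(current_peak_qps + added_peak_qps, qps_per_server)
print(f"servers before: {before}, after: {after}, extra: {after - before}")
```

With these assumed numbers the fleet grows from 20 to 24 servers, so the change needs 4 extra servers before launch.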

Implementation

This is SWE's responsibility.

Build, Test, Release

SWE needs to make sure that the new change builds successfully and that all tests pass.

SREs are responsible for the build and release tools.

Oncall and Incident Response

This is one of SRE's main responsibilities. If it is a production issue, SREs can take action (e.g. reboot, redirect traffic, get extra capacity, roll back); if it is a bug caused by a new change, SREs will reach out to SWEs (since SREs do not own the logic and treat binaries as black boxes).

Sometimes SWEs are fully responsible for oncall for a new system; only when it matures will SWEs hand over oncall to SREs (with a well-written playbook describing the proper reactions to different failure scenarios).

Others

SRE teams may be geographically distributed in order to cover oncall 24/7. This is not a must for SWE teams.