Google vs Open Source
Google relies heavily on open-source projects, both ones it started and ones led by the wider community, across its infrastructure, products, and services. Alphabet (Google's parent company) depends on thousands of upstream open-source projects and communities to run its operations.
Here are some of the key open-source projects Google depends on:
Google-Initiated Projects (and heavily depended upon)
- Kubernetes (k8s): A system for deploying and managing containerized applications across a cluster of hosts. Kubernetes was originally designed by Google and is a cornerstone of Google Cloud's container offerings.
- Android: An open-source operating system and software stack that forms the basis for a vast ecosystem of mobile, wearable, TV, and automotive devices. Google extensively uses and depends on Android for its mobile presence.
- Chromium: The open-source web browser project that forms the foundation of Google Chrome and many other browsers. Google builds Chrome from the Chromium source code.
- TensorFlow: An end-to-end open-source platform for machine learning, developed by the Google Brain team. It's deeply integrated with Google's own ML infrastructure, including TPUs and Google Cloud Platform.
- Go: An open-source programming language designed at Google, widely used for building scalable and efficient software, including within Google's own infrastructure.
- Apache Beam: A unified programming model for batch and streaming data processing, originally open-sourced by Google. Alphabet employees are significant contributors to Apache Beam.
- gRPC: A high-performance, open-source universal RPC framework developed by Google. Alphabet employees also contribute to gRPC.
- LevelDB: A fast, lightweight key-value storage library written by Google engineers. It is used as an embedded storage engine inside other systems (for example, Chrome's IndexedDB implementation), and RocksDB began as a fork of it.
- FlatBuffers: A cross-platform serialization library created by Google for high-performance applications, especially in areas like games.
- Google Test (googletest): A C++ testing framework created at Google, which is widely adopted and used by major open-source projects like Chromium and TensorFlow.
- Angular: An open-source web application framework led by Google's Angular Team, used for building web, mobile, and desktop applications.
- Flutter: A free and open-source UI toolkit by Google for building natively compiled applications for mobile, web, and desktop from a single codebase.
- Firebase: An app development platform (acquired by Google in 2014) that runs on Google's infrastructure and publishes many of its SDKs and components as open source.
- Bazel: Google's own fast, scalable, multi-language, and extensible build system, which is open-source.
Community-Led Projects (that Google depends on and contributes to)
- Envoy: A high-performance open-source proxy, originally created at Lyft and often used as an edge proxy or as the data plane of a service mesh. Envoy is a community-led project that Alphabet employees contribute to.
- LLVM: A collection of modular and reusable compiler and toolchain technologies. LLVM is a community-led project that Alphabet employees contribute to.
- Linux kernel: The foundation of the Android operating system and a critical component for Google's server infrastructure. Google is a significant contributor to the Linux kernel.
- GCC (GNU Compiler Collection): A set of compilers that Google contributes to, essential for building many software projects.
- web-platform-tests: A large and growing suite of tests for the web platform, developed collaboratively by browser vendors (including Google) and other stakeholders.
- OSS-Fuzz: A free fuzzing service that Google runs for popular open-source projects. Google initiated it rather than the community, but it is listed here because it improves the security of the upstream dependencies Google relies on.
- Open Source Insights (deps.dev): Also a Google project. It scans and visualizes dependency graphs for millions of open-source packages, helping developers (including those at Google) understand the structure and security of the software they use.
Google Internal Version vs Open Source Version
Blaze and Bazel
The names are anagrams of each other, which hints at how closely the two are related.
- Blaze: Google's internal build system.
- Bazel: the open source version.
Borg and Kubernetes
- Borg: Google's internal cluster manager.
  - Package format: MPM (Midas Package Manager).
- Kubernetes: open source, modeled after Borg, and written in Go.
  - Package format: container images.
Public info about Borg and MPM:
- Borg: https://ai.google/research/pubs/pub43438
- MPM: https://www.usenix.org/conference/lisa14/conference-program/presentation/mcnutt
Knative vs Cloud Run
- Knative: an open-source serverless platform that runs on top of Kubernetes; now a CNCF project.
- Cloud Run: Google Cloud's managed serverless container service, which is compatible with the Knative Serving API.
Flume vs Apache Beam
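- Flume: Google's internal data pipeline framework. It originally ran on top of MapReduce and has since moved to a faster underlying engine.
- Apache Beam: the open-source counterpart. It grew out of the Cloud Dataflow SDK that Google donated to the Apache Software Foundation, which builds on ideas from Flume.
Google's Flume is unrelated to Apache Flume, which is a service for collecting and moving log data.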
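To make the idea of one programming model for both batch and streaming input more concrete, here is a minimal sketch in plain Python. It does not use the Beam SDK or mirror Flume's API; the names `split_words`, `count_words`, `word_count`, and `line_stream` are hypothetical and exist only for illustration. The point is that the same chain of transforms can consume either a bounded collection or a lazily produced sequence.

```python
from collections import Counter
from typing import Iterable, Iterator


def split_words(lines: Iterable[str]) -> Iterator[str]:
    """Transform: turn lines into lowercase words, one element at a time."""
    for line in lines:
        for word in line.lower().split():
            yield word


def count_words(words: Iterable[str]) -> Counter:
    """Aggregation: count occurrences of each word."""
    return Counter(words)


def word_count(source: Iterable[str]) -> Counter:
    """Hypothetical pipeline: the same transforms run over any iterable source."""
    return count_words(split_words(source))


def line_stream() -> Iterator[str]:
    """Stands in for a streaming source that yields lines as they arrive."""
    yield "go bazel go"
    yield "bazel beam"


# Batch input: a bounded, in-memory collection.
batch_result = word_count(["go bazel go", "bazel beam"])

# Streaming-style input: elements produced lazily by a generator.
stream_result = word_count(line_stream())

# Both sources go through identical pipeline code and give the same counts.
print(batch_result == stream_result)
```

A real streaming pipeline also needs windowing and triggers to emit results from an unbounded source; this sketch sidesteps that by using a finite generator.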
Stubby vs gRPC
- Stubby: Google's internal RPC framework.
- gRPC: the open-source successor to Stubby.
Both use Protocol Buffers (protobuf) to define services and serialize messages.
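As a small illustration of what protobuf does on the wire, the sketch below hand-encodes a single integer field using the base-128 varint scheme from the Protocol Buffers encoding documentation. It is plain Python rather than the protobuf library, it handles only non-negative integers, and the helper names (`encode_varint`, `encode_varint_field`, `decode_varint`) are hypothetical.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(low7)  # last byte: continuation bit stays clear
            return bytes(out)


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Encode one varint-typed field: the tag, then the value."""
    wire_type_varint = 0
    tag = (field_number << 3) | wire_type_varint
    return encode_varint(tag) + encode_varint(value)


def decode_varint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# Field number 1 holding the value 150.
encoded = encode_varint_field(1, 150)
print(encoded.hex())  # 089601
```

In practice these bytes are produced by code generated from a `.proto` file, not written by hand; the sketch only shows why small integers and low field numbers stay compact.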
GFS and HDFS
- GFS (Google File System): Google's previous-generation distributed file system, described in a paper Google published in 2003.
- HDFS (Hadoop Distributed File System): part of Apache Hadoop; an open-source implementation of the design described in the GFS paper.