
GCP - Compute (GCE, GKE, and Cloud Run)

On Google Cloud, the three most common choices are Google Compute Engine (GCE), Google Kubernetes Engine (GKE), and Cloud Run.

Google Compute Engine (GCE)

What it is: GCE provides you with a raw virtual machine (VM). Think of it as Google handing you a plot of land and a set of keys to a powerful, empty server.

You are in complete control. You choose the operating system (Linux or Windows), you decide the exact CPU and RAM, you install the web server, you configure the firewall, and you are responsible for every single piece of software on it.

When to choose GCE

  • You need maximum control. Your application has very specific, non-standard operating system requirements or needs access to specialized hardware.
  • You are "lifting and shifting." You have an existing application running on a physical server in your office, and you want to move it to the cloud with the fewest changes possible.
  • You are running legacy software. You have an application that wasn't designed for modern, containerized environments.

The trade-off? With great power comes great responsibility. You are the builder, the plumber, and the security guard. All maintenance, patching, and scaling are on you.

Under the hood of GCE

Every virtual machine (VM) instance stores its metadata on a metadata server. Your VM automatically has access to the metadata server API without any additional authorization. Metadata is stored as key:value pairs.
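
To make this concrete, here is a minimal sketch (Python, standard library only) of how a process running on a GCE VM can read its own metadata. The metadata.google.internal endpoint and the required Metadata-Flavor: Google header are documented GCE behavior; the specific keys queried are just common examples, and the code only works from inside a VM.

```python
# Minimal sketch: query the GCE metadata server from inside a VM.
import os
import urllib.request

METADATA_URL = "http://metadata.google.internal/computeMetadata/v1/"
HEADERS = {"Metadata-Flavor": "Google"}  # required, or the server rejects the request


def get_metadata(path: str) -> str:
    """Fetch a single metadata value, e.g. 'instance/name' or 'instance/zone'."""
    req = urllib.request.Request(METADATA_URL + path, headers=HEADERS)
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()


if __name__ == "__main__":
    print("instance name:", get_metadata("instance/name"))
    print("zone:", get_metadata("instance/zone"))
    print("project id:", get_metadata("project/project-id"))
```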

Google Kubernetes Engine (GKE)

What it is: GKE is Google's managed version of Kubernetes, the open-source standard for running containerized applications at scale.

You package your application into standardized boxes called containers, and then you tell Kubernetes, "I need 5 of these web server containers and 2 of these database containers."

Kubernetes, orchestrated by GKE, handles the rest. It automatically finds a place for your containers, ensures they are always running, and can instantly scale up to handle a massive influx of traffic.
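
To illustrate that declarative model, here is a minimal sketch using the official Kubernetes Python client to ask for 5 replicas of a web container. The names and image path are placeholders; in practice the same request is usually written as a YAML manifest and applied with kubectl.

```python
# Minimal sketch: declare "I want 5 replicas of this web container" against a cluster.
from kubernetes import client, config

config.load_kube_config()  # reads the local kubeconfig, e.g. one created by gcloud for a GKE cluster

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=5,  # the desired state; Kubernetes keeps 5 Pods running
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "web"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="web",
                        image="us-docker.pkg.dev/my-project/my-repo/web:latest",  # placeholder image
                        ports=[client.V1ContainerPort(container_port=8080)],
                    )
                ]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```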

When to choose GKE

  • You need to run complex, multi-component applications (microservices).
  • Scalability and reliability are your top priorities. GKE can handle massive, global-scale applications.
  • You want to avoid vendor lock-in. Kubernetes is an open-source standard, meaning you can take your application and run it on other clouds or even on-premise with minimal changes.

The trade-off? You no longer manage individual machines, but you now manage a powerful orchestration system. While GKE simplifies Kubernetes, there is still a learning curve. You need to understand concepts like Pods, Deployments, and Services.
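
For a taste of those concepts, the sketch below (continuing the hypothetical "web" app from the previous example) defines a Service that load-balances traffic across the Deployment's Pods; on GKE, a Service of type LoadBalancer is backed by Google Cloud load balancing.

```python
# Minimal sketch: expose the "web" Deployment's Pods behind a stable Service.
from kubernetes import client, config

config.load_kube_config()

service = client.V1Service(
    api_version="v1",
    kind="Service",
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1ServiceSpec(
        selector={"app": "web"},  # matches the Pod labels set by the Deployment
        ports=[client.V1ServicePort(port=80, target_port=8080)],
        type="LoadBalancer",  # on GKE this provisions a Google Cloud load balancer
    ),
)

client.CoreV1Api().create_namespaced_service(namespace="default", body=service)
```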

Under the hood of GKE

GKE "Nodes" are the individual (GCE) VM instances that make up the cluster and run your containerized workloads.

GKE Networking: GKE migrated from Calico (Dataplane V1, based on iptables) to Cilium (Dataplane V2, based on eBPF, running on each node as the anetd Pod). As packets arrive at a GKE node, eBPF programs installed in the kernel decide how to route and process them. Unlike packet processing with iptables, eBPF programs can use Kubernetes-specific metadata in the packet. This lets GKE Dataplane V2 process network packets in the kernel more efficiently and report annotated actions back to user space for logging.

GKE Dataplane V2 does not use kube-proxy: it uses Cilium instead to implement Kubernetes Services.

https://cloud.google.com/kubernetes-engine/docs/concepts/dataplane-v2#kube-proxy

GKE vs Open Source K8S

Compared with open-source Kubernetes, GKE adds managed features on top: networking integration, backup, authentication, config management, built-in add-ons for logging and metrics, and Config Sync.

Is the GKE Gateway (or GKE Inference Gateway) running inside the Kubernetes cluster?

No. The gateway runs on Google Cloud infrastructure, not as Pods inside your Kubernetes cluster.

Cloud Run

What it is: Cloud Run is the ultimate in simplicity and efficiency. It is a "serverless" platform, which means you stop thinking about servers altogether.

You simply package your code into a container, give it to Cloud Run, and say, "Run this." Google handles everything else. There are no virtual machines to manage, no operating systems to patch, and no Kubernetes clusters to configure.

Best of all, Cloud Run can scale down to zero. If no one is using your application, you pay nothing. The moment a request comes in, it instantly scales up the required number of instances to handle the load.
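
A minimal sketch of what "package your code into a container" means for Cloud Run: the essential contract is that the process listens on the port given in the PORT environment variable (8080 by default), serves HTTP, and keeps no state between requests. The Python server below satisfies that contract; wrapping it in a container image with a small Dockerfile is all Cloud Run needs.

```python
# Minimal sketch of a Cloud Run-ready service (standard library only).
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"Hello from Cloud Run\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Cloud Run injects the port to listen on via the PORT env var (default 8080).
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```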

When to choose Cloud Run

  • You are building web applications, APIs, or microservices.
  • Your traffic is unpredictable or "bursty." It’s perfect for applications with long idle periods punctuated by sudden spikes in activity.
  • You want the fastest development velocity. It allows your developers to focus only on writing code, not managing infrastructure.
  • Cost-efficiency is critical. You only pay for the exact compute time your code uses, down to the millisecond.

The trade-off? You give up control over the underlying infrastructure. You can't specify the OS or do low-level network configuration. Your application must be stateless and packaged in a container.

Under the hood of Cloud Run

Unlike GKE, Cloud Run does not run directly on GCE instances.

The first-generation Cloud Run execution environment is based on gVisor, while the second generation uses the same sandbox technology as GCE.