gVisor
gVisor is an open-source "application kernel" developed by Google. It provides a layer of isolation between a containerized application and the host operating system kernel, designed to make containers as secure as virtual machines (VMs) without the heavy resource overhead.
In the context of AI, gVisor is increasingly vital for running untrusted models, executing AI-generated code, and securing multi-tenant GPU environments.
How gVisor Works
In a standard container (like Docker), the application shares the host's Linux kernel. If the application is compromised, it could exploit a kernel vulnerability to "break out" and take over the entire server. gVisor prevents this by intercepting system calls.
The Core Components:
- The Sentry (The "User-Space Kernel"): This is the heart of gVisor. Written in Go (a memory-safe language), it acts as a substitute kernel that lives in user space. When an application makes a system call (like open() to open a file or socket() to create a socket), the Sentry intercepts it and handles it internally rather than letting it reach the host kernel.
- The Gofer (File System Proxy): To keep the Sentry even more isolated, it isn't allowed to access the file system directly. Instead, it talks to a separate process called the Gofer, which fetches files on its behalf.
- Platforms (Interception): gVisor uses different mechanisms (like ptrace or KVM) to redirect system calls from the application to the Sentry.
The result: The host kernel only sees a small, predictable set of system calls from gVisor itself, rather than the hundreds of potentially dangerous calls from the untrusted application.
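The interception model above can be sketched as a toy dispatch table: a hypothetical user-space "kernel" that emulates the calls it knows and refuses everything else. This is only an illustrative sketch (the real Sentry is written in Go and implements hundreds of syscalls), but it shows both the isolation and the compatibility trade-off discussed later.

```python
import errno

# Toy "sentry": a user-space syscall table. Calls it implements are
# emulated internally; anything else is rejected with ENOSYS, so the
# application's raw syscalls never reach the host kernel.
class ToySentry:
    def __init__(self):
        self._files = {"/etc/hostname": "sandbox\n"}  # emulated in-memory fs
        self._fds = {}
        self._next_fd = 3
        self._handlers = {"open": self._open, "read": self._read}

    def syscall(self, name, *args):
        handler = self._handlers.get(name)
        if handler is None:
            # Unimplemented syscall: the app gets ENOSYS instead of
            # the host kernel seeing the call at all.
            return -errno.ENOSYS
        return handler(*args)

    def _open(self, path):
        if path not in self._files:
            return -errno.ENOENT
        fd = self._next_fd
        self._next_fd += 1
        self._fds[fd] = path
        return fd

    def _read(self, fd):
        path = self._fds.get(fd)
        if path is None:
            return -errno.EBADF
        return self._files[path]

sentry = ToySentry()
fd = sentry.syscall("open", "/etc/hostname")
print(sentry.syscall("read", fd))            # emulated entirely in user space
print(sentry.syscall("mount", "/dev/sda"))   # rejected: not implemented
```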
Where is it used?
gVisor is used in environments where security and multi-tenancy are priorities:
- Google Cloud: It powers GKE Sandbox (more details below).
- SaaS Platforms: Companies that let users upload and run their own code (like CI/CD pipelines or online IDEs) use gVisor to ensure one user's code can't spy on another's.
- Financial & Healthcare: Organizations running sensitive workloads in Kubernetes use it as an extra layer of "defense in depth."
gVisor in Google Cloud
While Cloud Run's second-generation execution environment no longer uses gVisor, gVisor still provides "defense in depth" for several other Google Cloud services:
- Cloud Run (First Generation): If you specifically select the "First Generation" execution environment in Cloud Run settings, your code still runs inside gVisor.
- App Engine Standard Environment: Most runtimes (Python, Node.js, Go, etc.) in App Engine Standard use gVisor to isolate user code from the host.
- GKE Sandbox: This is the most common place for developers to use gVisor manually. It allows you to run a "Sandboxed" node pool where every Pod is wrapped in gVisor.
- Cloud Functions (1st Gen): The original Cloud Functions architecture uses gVisor. (2nd Gen Cloud Functions are built on Cloud Run v2 and thus use microVMs).
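On GKE, opting a Pod into the sandbox is done through the gvisor RuntimeClass. A minimal manifest, assuming the cluster already has a node pool with GKE Sandbox enabled:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: untrusted-workload
spec:
  runtimeClassName: gvisor   # run this Pod under gVisor (GKE Sandbox)
  containers:
  - name: app
    image: nginx
```

Pods without the runtimeClassName field continue to run on the standard container runtime, so sandboxing can be applied selectively to untrusted workloads.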
Why did Cloud Run v2 move away from gVisor?
The reason Cloud Run v2 moved to microVMs is compatibility. gVisor has to re-implement every Linux system call in Go; if your app depends on a niche system call that gVisor hasn't implemented yet, the app fails. MicroVMs run a real Linux kernel, so unmodified apps "just work."
The Big Shift: gVisor in AI
gVisor has found a massive second life in the AI world because it solves a specific problem that microVMs struggle with: GPU-accelerated sandboxing.
A. GKE Sandbox for AI Agents
Google recently introduced "GKE Sandbox for Agents." When an AI agent (like a LangChain agent) generates and executes Python code, that code is "untrusted." If it runs on a standard container, a "hallucinated" or malicious command could potentially escape to the host.
- The AI Use Case: GKE uses gVisor to create an ephemeral sandbox for that specific code execution. If the AI-generated code tries to wipe the server or scan the internal network, gVisor intercepts the system calls and blocks them.
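The execution pattern inside such a sandbox can be sketched as follows. The function name and structure are hypothetical; the point is that the application layer only enforces cheap limits like a timeout, while real containment comes from running the whole process under gVisor, not from anything in this script.

```python
import subprocess
import sys
import tempfile
import textwrap

def run_generated_code(code: str, timeout_s: float = 5.0) -> str:
    """Execute AI-generated Python in a child process with a timeout.

    Illustrative only: inside a gVisor sandbox, any syscalls this child
    makes are intercepted by the Sentry, so even a malicious snippet
    cannot reach the host kernel.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(textwrap.dedent(code))
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=timeout_s,
        )
        return result.stdout
    except subprocess.TimeoutExpired:
        return "<timed out>"

print(run_generated_code("print(1 + 1)"))
```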
B. GPU Isolation (nvproxy)
Until recently, you couldn't easily use a GPU inside a gVisor sandbox because GPU workloads need direct access to the host's GPU kernel driver. Google addressed this with nvproxy:
- How it works: gVisor intercepts CUDA and NVIDIA driver calls from the application and proxies them safely to the host GPU driver.
- Why it matters for AI: This allows multi-tenant AI platforms to share expensive GPUs between different customers. Each customer’s training or inference job is isolated by gVisor, preventing one user from accessing another user’s model weights or data in GPU memory.
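The proxy idea can be sketched as an allowlist filter: forward the driver commands the sandbox recognizes, reject everything else. This is a toy model with invented command names; the real nvproxy filters NVIDIA driver ioctls inside the Sentry, in Go.

```python
import errno

# Toy model of the nvproxy pattern: the sandbox sits between the app
# and the host GPU driver, forwarding only allowlisted commands.
ALLOWED_COMMANDS = {"alloc_memory", "launch_kernel", "free_memory"}

def host_driver(command: str, payload: str) -> str:
    # Stand-in for the real host GPU driver.
    return f"driver handled {command}({payload})"

def proxied_call(command: str, payload: str):
    if command not in ALLOWED_COMMANDS:
        # Unknown or dangerous commands never reach the host driver.
        return -errno.EPERM
    return host_driver(command, payload)

print(proxied_call("launch_kernel", "matmul"))    # forwarded to the host
print(proxied_call("map_host_memory", "0x1000"))  # rejected
```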
C. Industry Usage (OpenAI & Anthropic)
Because gVisor is open-source, it is used heavily outside of Google Cloud by the world's leading AI companies:
- OpenAI: Uses gVisor for high-risk tasks, specifically sandboxing code execution within ChatGPT (e.g., the Advanced Data Analysis feature).
- Anthropic: Is a major contributor to the gVisor project and uses it to isolate their internal AI research environments.