gVisor - Filesystem
gVisor is a "user-space kernel": it re-implements a large part of the Linux system call interface in Go rather than passing application syscalls straight to the host kernel. Because of this architecture, its handling of files and inodes is split between its own internal "virtual" world and the real host world.
The filesystem in gVisor is designed with a "Security-First" approach. In a standard container (runc), the process speaks directly to the Linux kernel to access files. In gVisor, the Sentry (guest kernel) is forbidden from touching the host’s files directly.
To solve this, gVisor uses a multi-process architecture involving the Sentry, the Gofer, and a specialized communication protocol.
The Core Architecture: Sentry vs. Gofer
gVisor splits filesystem work between two entities to ensure that even if an attacker breaks the Sentry, they still can't wander around the host's hard drive.
- The Sentry (The Brain): Manages the Virtual File System (VFS). It tracks which files are "open" inside the container, manages absolute vs. relative paths, and handles permissions. However, it cannot physically "open" a file on your SSD.
- The Gofer (The Hands): A separate host process started alongside the container, which serves the container's host-backed mounts. It has the actual "permission" to talk to the host OS.
The Step-by-Step Flow: Opening a File
When an application inside gVisor calls open("/data/config.txt"), here is the chain of events:
- Intercept: The Sentry intercepts the `open` syscall.
- VFS Lookup: The Sentry checks its internal VFS tree to see where `/data` is mounted.
- The Request: The Sentry sends a message (over a Unix socket) to the Gofer process. It says: "Please give me a handle for 'config.txt' inside your root directory."
- Host Open: The Gofer process, running on the host, performs the actual `open()` syscall to the Host Linux Kernel.
- FD Passing: If successful, the Gofer receives a File Descriptor (FD) from the host kernel. It then "passes" this FD back to the Sentry over the Unix socket as SCM_RIGHTS ancillary data.
- Usage: The Sentry now has a direct FD to the host file. When the app calls `read()`, the Sentry can often read directly from that FD to save time.
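The FD Passing step is the least familiar part of this flow, so here is a minimal Go sketch of it. It is not gVisor code: both ends live in one process, a socket pair stands in for the Sentry-Gofer connection, and the helper names (`sendFD`, `recvFD`, `roundTrip`) are made up for illustration. It assumes a Linux host.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// sendFD passes an open file descriptor to the peer of sock using SCM_RIGHTS.
func sendFD(sock, fd int) error {
	rights := syscall.UnixRights(fd)
	// One byte of ordinary payload accompanies the control message.
	return syscall.Sendmsg(sock, []byte{0}, rights, nil, 0)
}

// recvFD receives a single descriptor sent by sendFD.
func recvFD(sock int) (int, error) {
	payload := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit descriptor
	_, oobn, _, _, err := syscall.Recvmsg(sock, payload, oob, 0)
	if err != nil {
		return -1, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return -1, err
	}
	if len(msgs) != 1 {
		return -1, fmt.Errorf("expected 1 control message, got %d", len(msgs))
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return -1, err
	}
	if len(fds) != 1 {
		return -1, fmt.Errorf("expected 1 descriptor, got %d", len(fds))
	}
	return fds[0], nil
}

// roundTrip plays both roles: the "Gofer" side opens a host file and sends
// its FD; the "Sentry" side receives the FD and reads through it.
func roundTrip(content string) (string, error) {
	pair, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(pair[0])
	defer syscall.Close(pair[1])

	f, err := os.CreateTemp("", "config-*.txt")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()
	if _, err := f.WriteString(content); err != nil {
		return "", err
	}

	if err := sendFD(pair[0], int(f.Fd())); err != nil {
		return "", err
	}
	fd, err := recvFD(pair[1])
	if err != nil {
		return "", err
	}
	received := os.NewFile(uintptr(fd), "received")
	defer received.Close()

	// The received FD refers to the same open file as the sender's FD,
	// so read at an explicit position instead of the shared offset.
	buf := make([]byte, len(content))
	if _, err := received.ReadAt(buf, 0); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := roundTrip("key=value\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("read via passed FD: %q\n", got)
}
```

The receiving side never calls `open()` itself. It only gets a descriptor that someone else was allowed to open, which is the property the Sentry relies on.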
Types of Filesystems in gVisor
Not everything goes through the Gofer. gVisor uses different strategies depending on the file type:
| FS Type | Location | Implementation |
|---|---|---|
| Root/Bind Mounts | Host Disk | Handled via the Gofer. |
| `tmpfs` | RAM | Exists entirely in the Sentry's memory. Never touches the disk or Gofer. |
| `/proc`, `/sys` | Virtual | Emulated by the Sentry code to look like Linux but showing sandbox data. |
| `/dev` | Virtual/Host | Some (like `/dev/null`) are emulated; others (like `/dev/kvm`) are "passed through." |
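One way to picture this table is as a mount table that the Sentry consults before deciding who serves a path. The Go sketch below is purely illustrative: the `mount` type, the backend names, and the longest-prefix `lookup` are assumptions for this example, not gVisor's actual VFS data structures. Paths are assumed to be absolute and already cleaned.

```go
package main

import (
	"fmt"
	"strings"
)

// backend names which implementation serves a mount (hypothetical labels).
type backend string

const (
	backendGofer  backend = "gofer"  // host-backed, goes through the Gofer
	backendTmpfs  backend = "tmpfs"  // lives in Sentry memory
	backendProcfs backend = "procfs" // synthesized by the Sentry
)

type mount struct {
	point string
	kind  backend
}

// lookup returns the backend of the longest mount point that contains path.
func lookup(mounts []mount, path string) backend {
	var best mount
	for _, m := range mounts {
		covers := m.point == "/" || path == m.point ||
			strings.HasPrefix(path, m.point+"/")
		if covers && len(m.point) > len(best.point) {
			best = m
		}
	}
	return best.kind
}

func main() {
	mounts := []mount{
		{"/", backendGofer},
		{"/tmp", backendTmpfs},
		{"/proc", backendProcfs},
	}
	for _, p := range []string{"/data/config.txt", "/tmp/scratch", "/proc/self/status"} {
		fmt.Printf("%-20s -> %s\n", p, lookup(mounts, p))
	}
}
```

Only paths that resolve to the host-backed backend would ever produce a message to the Gofer; the other two are answered from Sentry memory.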
The Protocol: From 9P to LISAFS
For a long time, the Sentry and Gofer talked to each other using 9P (a protocol from the Plan 9 OS). However, 9P was slow for modern container workloads.
Today, gVisor uses LISAFS (Linux Sandbox File System).
- It is a highly optimized, gVisor-specific protocol.
- It can exchange messages over shared-memory channels, which cuts the cost of copying bytes between the Sentry and Gofer through a socket.
Why do it this way? (Security Benefits)
- No Path-Traversal Attacks: The Sentry doesn't know the "real" path of a file on the host. It only knows names relative to what the Gofer provides. If an app tries to `open("../../../etc/shadow")`, the Gofer simply sees a request for a file outside its root and denies it.
- Double Seccomp:
  - The Sentry is blocked from making `open()` calls by a host-level Seccomp filter.
  - The Gofer is blocked from making `network` calls.
  - Even if a hacker takes over the Sentry, they can't open new files. If they take over the Gofer, they can't send files over the network.
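To make the path-traversal point concrete, here is a minimal Go sketch of a server that refuses any name resolving outside its root. This string-based check is an assumption for illustration only; a real Gofer depends on kernel-level confinement of the process, not on path arithmetic like this. The function name `resolveUnderRoot` and the root `/srv/rootfs` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errEscape = errors.New("path escapes the served root")

// resolveUnderRoot joins name onto root and rejects results outside root.
// root is assumed to be an absolute, cleaned path.
func resolveUnderRoot(root, name string) (string, error) {
	full := filepath.Join(root, name) // Join also cleans "." and ".."
	rel, err := filepath.Rel(root, full)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errEscape
	}
	return full, nil
}

func main() {
	for _, name := range []string{"data/config.txt", "../../../etc/shadow"} {
		p, err := resolveUnderRoot("/srv/rootfs", name)
		if err != nil {
			fmt.Printf("%-22s denied: %v\n", name, err)
			continue
		}
		fmt.Printf("%-22s -> %s\n", name, p)
	}
}
```

Note that a check like this does not account for symlinks, which is one reason process-level confinement is the stronger defense.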
The Performance Penalty
This architecture is why gVisor is slower at I/O than runc.
- In `runc`, `open()` is 1 syscall.
- In `runsc`, `open()` involves an interception, a context switch to the Sentry, a message to the Gofer, a host syscall by the Gofer, and a message back to the Sentry.
Does gVisor have its own Global Open File Table?
Yes.
Inside the Sentry (the gVisor kernel), it maintains its own internal versions of the Linux kernel tables. It has to do this to ensure that the application running inside the sandbox behaves exactly as it would on a real Linux machine.
- Process FD Table: The Sentry manages an `FDTable` for every sandboxed process.
- Global Open File Table: The Sentry maintains its own internal "Open File Description" objects (called `vfs.FileDescription` in the current Go source code; the older VFS1 code called them `fs.File`).
- Sharing: If a process inside gVisor calls `fork()`, the Sentry ensures both internal processes point to the same internal file description object. This way, gVisor manages the shared file offset entirely in user-space Go code, without asking the host kernel to do it.
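The sharing behavior can be sketched in a few lines of Go. The types below (`fileDescription`, `fdTable`) are simplified stand-ins invented for this example, not gVisor's real structures, and the sketch ignores locking and reference counting.

```go
package main

import "fmt"

// fileDescription models an open file description: content plus one offset.
type fileDescription struct {
	data   []byte
	offset int
}

// read returns up to n bytes from the current offset and advances it.
func (f *fileDescription) read(n int) []byte {
	remaining := f.data[f.offset:]
	if n > len(remaining) {
		n = len(remaining)
	}
	f.offset += n
	return remaining[:n]
}

// fdTable maps per-process FD numbers to shared file descriptions.
type fdTable map[int]*fileDescription

// fork copies the table but not the descriptions, so parent and child
// share each description and therefore its offset.
func (t fdTable) fork() fdTable {
	child := make(fdTable, len(t))
	for fd, desc := range t {
		child[fd] = desc
	}
	return child
}

func main() {
	parent := fdTable{3: &fileDescription{data: []byte("hello world")}}
	child := parent.fork()

	fmt.Printf("parent reads %q\n", parent[3].read(5))
	fmt.Printf("child reads  %q\n", child[3].read(6))
}
```

Because the child's FD 3 points at the same description, its read continues where the parent's read stopped, which matches what Linux applications expect after `fork()`.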
Are Inodes in gVisor "real" host inodes?
No. gVisor uses Virtual Inodes.
The application inside the sandbox sees an inode number (e.g., when it calls stat), but that number is a synthetic ID generated by gVisor.
Why gVisor uses Virtual Inodes:
- Security: If gVisor passed real host inode numbers to the app, a malicious app might use those numbers to gain information about the host's layout or attempt "inode-based" attacks.
- Abstraction: gVisor often doesn't even have a direct connection to the host filesystem. It uses a separate process called the Gofer.
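A minimal Go sketch of the idea: keep a map from the host's (device, inode) pair to a number handed out by the sandbox, so the same host file always gets the same synthetic ID and the host's real number is never exposed. The `inoAllocator` type and its method names are hypothetical and only illustrate the technique.

```go
package main

import "fmt"

// hostKey identifies a file on the host by device and inode number.
type hostKey struct {
	dev, ino uint64
}

// inoAllocator hands out sandbox-visible inode numbers (illustrative only).
type inoAllocator struct {
	next   uint64
	byHost map[hostKey]uint64
}

func newInoAllocator() *inoAllocator {
	return &inoAllocator{next: 1, byHost: make(map[hostKey]uint64)}
}

// sandboxIno returns a stable synthetic inode number for a host file.
func (a *inoAllocator) sandboxIno(dev, ino uint64) uint64 {
	key := hostKey{dev, ino}
	if v, ok := a.byHost[key]; ok {
		return v // same host file, same synthetic number
	}
	v := a.next
	a.next++
	a.byHost[key] = v
	return v
}

func main() {
	alloc := newInoAllocator()
	fmt.Println(alloc.sandboxIno(2049, 1311234)) // first file seen
	fmt.Println(alloc.sandboxIno(2049, 98765))   // a different file
	fmt.Println(alloc.sandboxIno(2049, 1311234)) // first file again
}
```

Keeping the mapping stable matters because applications compare inode numbers to detect hard links or to notice that two paths name the same file.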
How File Access Actually Works (The Gofer)
To maintain a strong security boundary, the gVisor "Sentry" is not allowed to open files on the host directly. Instead, it talks to a process called the Gofer.
- App Request: App calls `open("/etc/passwd")`.
- Sentry Internal: The Sentry checks its internal virtual filesystem. It creates a virtual Inode and a virtual `struct file`.
- Gofer Request: The Sentry asks the Gofer (via a protocol like 9P or LISAFS) to open the file on the host.
- Host Reality: The Gofer process (running on the real host) calls a real `open()` on the host kernel.
- Host FD: The Gofer gets a real Host FD pointing to a real Host Inode.
- The Bridge: The Gofer (or the Sentry) holds that host FD. When the app inside the sandbox reads, the Sentry intercepts the call, performs its own internal logic (offset, checks), and then eventually performs a real read on the host FD.
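The Bridge step can be sketched as a wrapper that keeps its own offset and issues positional reads against the host file. This is an illustration under stated assumptions, not gVisor's implementation: `sandboxFile` and `demo` are hypothetical names, and an `*os.File` stands in for the host FD.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// sandboxFile pairs a host file with an offset tracked in user space.
type sandboxFile struct {
	host   *os.File // stands in for the host FD
	offset int64    // maintained by the "Sentry", not the host kernel
}

// Read performs a positional read at the tracked offset, then advances it.
func (s *sandboxFile) Read(p []byte) (int, error) {
	n, err := s.host.ReadAt(p, s.offset)
	s.offset += int64(n)
	return n, err
}

// demo reads twice through the wrapper and reports the host-side offset.
func demo() (string, string, int64, error) {
	tmp, err := os.CreateTemp("", "bridge-*.txt")
	if err != nil {
		return "", "", 0, err
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.WriteString("abcdefgh"); err != nil {
		tmp.Close()
		return "", "", 0, err
	}
	if err := tmp.Close(); err != nil {
		return "", "", 0, err
	}

	host, err := os.Open(tmp.Name())
	if err != nil {
		return "", "", 0, err
	}
	defer host.Close()

	f := &sandboxFile{host: host}
	buf := make([]byte, 3)
	if _, err := f.Read(buf); err != nil {
		return "", "", 0, err
	}
	first := string(buf)
	if _, err := f.Read(buf); err != nil {
		return "", "", 0, err
	}
	second := string(buf)

	// Positional reads leave the host file's own offset untouched.
	hostOffset, err := host.Seek(0, io.SeekCurrent)
	return first, second, hostOffset, err
}

func main() {
	first, second, hostOffset, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("first=%q second=%q host offset=%d\n", first, second, hostOffset)
}
```

The two reads return consecutive data even though the host file's own offset never moves, which shows the bookkeeping living entirely on the sandbox side.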
Comparison: gVisor vs. Host
| Component | Inside gVisor (What App Sees) | Outside gVisor (What Host Sees) |
|---|---|---|
| FD Number | Managed by Sentry (e.g., FD 3). | Managed by Host (e.g., FD 125). |
| Open File Table | Internal Go structures in Sentry. | The real Global Table in the Host Kernel. |
| Inode Number | A "fake" ID generated by Sentry. | The real Inode number on the disk. |
| File Offset | Managed by gVisor's Go code. | Managed by the Host (for the Gofer's FD). |
The Performance Overhead
This "double-bookkeeping" is a large part of why gVisor has a performance overhead: a "virtual" file operation often requires an extra communication step between the Sentry, the Gofer, and the Host Kernel.