
Linux - VFS (Virtual Filesystem)

In Linux, VFS (Virtual Filesystem) is the kernel software layer that acts as a "Universal Translator" between userspace applications and the actual data stored on a disk.

Because of VFS, your code doesn't have to care whether it's talking to a local disk filesystem (ext4 or XFS), a network filesystem (NFS), or even a pseudo-filesystem like /proc.

The "Why": The Problem of Variety

Without VFS, if you wanted to write a simple C program to save a file, you would have to write different code for every single filesystem in existence.

  • write_to_ext4(file, data)
  • write_to_nfs(file, data)
  • write_to_xfs(file, data)

VFS solves this. It provides a standardized set of system calls (open, read, write, close) that present the same interface regardless of the underlying storage.

The VFS Architecture: The Four Primary Objects

VFS is an object-oriented layer (written in C) that uses four primary data structures to track everything. To truly understand Linux internals, you must know these four:

  1. The Superblock (struct super_block): Represents an entire mounted filesystem. It contains metadata like the block size, the total size of the filesystem, how much is free, and what the "type" is (ext4, etc.).
  2. The Inode (struct inode): Represents a specific file or directory on the disk. Crucially, the inode contains all metadata about a file (size, permissions, timestamps, location of data blocks) except for the filename.
  3. The Dentry (struct dentry): Short for "Directory Entry." This object maps a filename to an Inode. VFS uses dentries to speed up lookups (via the "Dentry Cache"). When you look up /home/user/file.txt, VFS finds or creates a dentry object for each component of the path (/, home, user, file.txt).
  4. The File Object (struct file): Represents one open instance of a file. This tracks things like the "current offset" (where you are currently reading in the file). It is created when a process calls open(), and it can be shared between file descriptors through dup() or fork().

How it Works: The "Contract"

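When a filesystem (like XFS) is written, it must "register" itself with the VFS by calling register_filesystem() with a struct file_system_type. It then supplies tables of function pointers for each kind of object (super_operations, inode_operations, file_operations, and others). The table that handles calls such as read() and write() is file_operations.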

When you call read() in your C or C++ code:

  1. The kernel catches the system call and passes it to the VFS.
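  2. VFS uses the file descriptor to find the file object (struct file) for the open file.
  3. The file object carries a pointer (f_op) to the function table of the filesystem the file lives on, in this case XFS.
  4. VFS calls the read handler from that table, which for XFS is xfs_file_read_iter().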

The VFS is essentially a giant switchboard.

Everything is a File

VFS is the reason for the famous Linux philosophy: "Everything is a file."

Because VFS provides a unified interface, the kernel can trick applications into treating things that aren't "files" as if they were:

  • Pipes: The shell's | creates a pipe between two processes, and each end of the pipe is an ordinary file descriptor that you read() from or write() to.
  • Sockets: Network connections are accessed via VFS file descriptors.
  • Devices: Your hard drive is just a file at /dev/sda.
  • Process and Hardware Info: Your process list is exposed as files under /proc, and readings such as CPU temperature appear as text files under /sys.

VFS and the Page Cache

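The VFS layer also works hand in hand with the Page Cache, which is maintained by the kernel's memory-management subsystem.

When you read a file from a slow HDD, the kernel keeps the data in "Pages" in your RAM. The next time you (or any other process) request that file, the read path finds those pages in the cache and hands you the data from RAM instead of going back to the disk. This is why repeated reads of the same files get faster once a system has been running for a while, as long as that memory has not been reclaimed for other uses.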

How is it related to OverlayFS?

OverlayFS is a "stackable" filesystem: it registers with the VFS like any other filesystem, but it sits on top of other mounted filesystems instead of a block device.

  • It doesn't talk to a disk directly.
  • Instead, it tells the VFS: "When someone asks for a file, check the upperdir dentry first. If it's not there, check the lowerdir dentry."

Summary

  • VFS is the abstraction layer between apps and disks.
  • It uses Superblocks, Inodes, Dentries, and File Objects.
  • It enables portability (one code for all disks) and the "Everything is a file" philosophy.
  • It is the "brain" that coordinates between your app, the Page Cache, and the actual filesystem driver (like XFS or ext4).