Linux - Tracing tools
The naming convention in Linux can be confusing because many tools end in "trace," but they operate at different levels of the system and use completely different "under-the-hood" technologies.
Tracing tools vs Tracers
In a broad sense, a Tracing Tool is the user-space software you interact with to collect data.
- Examples: ftrace (driven through trace-cmd or its tracefs files), perf, strace, bpftrace, LTTng.
- These tools provide the commands, the filters, and the output formatting that you see on your screen.
A Tracer is a specific plugin or engine inside the kernel that implements a particular type of data collection logic.
Think of ftrace as a Swiss Army Knife and a "tracer" as one of the individual blades you can fold out.
If you look at /sys/kernel/tracing/available_tracers (or /sys/kernel/debug/tracing/available_tracers on older setups), you will see a list:
sudo cat /sys/kernel/tracing/available_tracers
blk mmiotrace function_graph function nop
Each of these is a "tracer". The exact list depends on how the kernel was built, which is why hwlat, described below, is missing from the sample output:
- function: Traces every traceable function call in the kernel.
- function_graph: Traces function calls but also draws a "tree" showing which function called which, and how long they took (the "graph").
- blk: Traces block I/O (disk) events.
- hwlat: Specifically looks for hardware latency issues.
- nop: The default "no-operation" tracer (no tracer is active).
The /sys/kernel/tracing directory is the official interface for the ftrace framework. In modern Linux kernels, that folder is a special virtual filesystem called tracefs. The older path, /sys/kernel/debug/tracing, exposes the same files and is kept for compatibility.
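The tracefs files described above can be read with a few lines of shell. This is a minimal sketch, not part of ftrace itself: the function names and the optional directory argument are illustrative, and it assumes tracefs is mounted at /sys/kernel/tracing and that you have permission to read it (usually root).

```shell
#!/bin/sh
# Sketch: inspect ftrace tracers through tracefs.
# Assumption: tracefs is mounted at /sys/kernel/tracing. Pass another
# directory (for example /sys/kernel/debug/tracing) as the last argument.

# Print one available tracer per line.
list_tracers() {
    dir="${1:-/sys/kernel/tracing}"
    tr -s ' ' '\n' < "$dir/available_tracers" | sed '/^$/d'
}

# Print the tracer that is currently selected (nop when none is active).
current_tracer() {
    dir="${1:-/sys/kernel/tracing}"
    cat "$dir/current_tracer"
}

# Succeed only if the named tracer is offered by this kernel.
has_tracer() {
    name="$1"
    list_tracers "$2" | grep -qx -- "$name"
}
```

After sourcing the file as root, `has_tracer hwlat && echo "hwlat is available"` is one way to check for a tracer before trying to select it.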
How do they relate?
The relationship is hierarchical: Tool -> Framework -> Tracer -> Hook.
| Layer | Component | Description |
|---|---|---|
| Tool (User Interface) | trace-cmd or perf | The command line tool you run. |
| Framework (Infrastructure) | ftrace | The kernel subsystem that manages buffers and control files. |
| Tracer (The Logic) | function_graph | The specific "engine" currently active inside ftrace. |
| Hook (The Source) | Tracepoints or Kprobes | The actual point in the code where the data is grabbed. |
ftrace vs. ptrace (The "Engine" Difference)
This is the most important distinction. They share a similar name but have almost nothing in common in terms of implementation.
- ptrace (Process Trace):
  - What it is: A system call (man ptrace).
  - How it works: It allows one process (the "tracer") to control and observe another process (the "tracee"). It can pause the process, inspect its memory, and look at its registers.
  - Cost: Very high. Every time the tracee hits a traced event (for strace, every system call entry and exit), the kernel has to stop it and context-switch to the tracer. This makes the program run significantly slower.
  - Use case: Debuggers (like GDB) and tracers (like strace).
- ftrace (Function Trace):
  - What it is: A kernel framework.
  - How it works: It is built directly into the kernel's code. It uses binary patching (at run time, the NOPs the compiler left at function entry are replaced with calls into the tracing code) to record data without stopping the execution of the system.
  - Cost: Very low. Functions that are not being traced only execute the NOPs, and the cost with tracing on grows with the number of functions traced, so a filtered trace is suitable for production environments.
  - Use case: Analyzing kernel latency, scheduling issues, and driver behavior.
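To make the ftrace side concrete, here is a hedged sketch of capturing a short function_graph trace by writing to tracefs control files. It assumes root, a kernel built with the function_graph tracer and dynamic ftrace, and tracefs at /sys/kernel/tracing. The function name `capture_graph` and its arguments are made up for illustration, and `vfs_read` is only an example target.

```shell
#!/bin/sh
# Sketch: record a function_graph trace for one kernel function.
# Assumptions: run as root; tracefs at /sys/kernel/tracing (override via $3).

capture_graph() {
    func="$1"                         # kernel function to graph, e.g. vfs_read
    secs="${2:-1}"                    # how long to record
    dir="${3:-/sys/kernel/tracing}"

    echo 0 > "$dir/tracing_on"                  # pause recording while configuring
    echo "$func" > "$dir/set_graph_function"    # limit the graph to this function
    echo function_graph > "$dir/current_tracer" # select the tracer
    echo 1 > "$dir/tracing_on"                  # start recording
    sleep "$secs"
    echo 0 > "$dir/tracing_on"                  # stop recording
    cat "$dir/trace"                            # print the captured call graph

    # Clean up: clear the filter and fall back to the nop tracer.
    echo > "$dir/set_graph_function"
    echo nop > "$dir/current_tracer"
}
```

A call such as `capture_graph vfs_read 2 > graph.txt` would record for two seconds. The sketch leaves tracing_on at 0, so write 1 to that file afterwards if other tracing should resume.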
ftrace vs. strace (System Call Tracing)
- strace (System Call Trace):
  - Level: User-space interface.
  - Function: It tells you what an application is asking the kernel to do (e.g., "Open this file," "Write to this socket").
  - Mechanism: It is a wrapper around ptrace.
  - Analogy: If the kernel is a restaurant, strace lists the orders the customers (apps) give to the waiters.
- ftrace:
  - Level: Kernel-space internal.
  - Function: It tells you what happens after the request is made. If strace shows open(), ftrace shows the dozens of internal kernel functions that actually find the file on the hard drive.
  - Analogy: ftrace is the camera in the kitchen showing exactly how the chef is cooking the meal.
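As a small illustration of the "list of orders" that strace produces, the sketch below counts system calls by name in a saved log. It assumes the log was written for a single process with `strace -o FILE command`, so each call line starts with `name(`. The function name `syscall_counts` is illustrative, and strace's own `-c` option gives a built-in summary if you do not need the raw log.

```shell
#!/bin/sh
# Sketch: summarize which system calls appear in an strace log.
# Assumption: the log came from `strace -o FILE command` (no PID prefixes).

syscall_counts() {
    # Keep only lines that look like "name(args...) = result",
    # cut everything from the first "(" onward, then count each name.
    grep -E '^[a-z_0-9]+\(' "$1" | cut -d'(' -f1 | sort | uniq -c | sort -rn
}
```

For example, after `strace -o app.log ls`, running `syscall_counts app.log` prints the most frequent calls first.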
ltrace (Library Trace)
- ltrace:
  - Function: It traces calls to shared libraries (like libc.so). It shows when an app calls printf() or malloc().
  - Mechanism: It also uses ptrace, usually by placing breakpoints on the program's PLT (Procedure Linkage Table) entries.
  - Relationship to ftrace: None. ltrace stays in user-space; ftrace stays in kernel-space.
Where do eBPF and bpftrace fit in?
In modern Linux (post-2015), eBPF (and its tool bpftrace) has become the "big brother" to ftrace.
- ftrace is like a fixed-lens camera: It’s very good at showing you the functions and the flow, but it's hard to do complex math or logic on the fly.
- eBPF is like a programmable smart camera: You can write small programs to "only trace this function if the file being opened is /etc/shadow and the user is not root."
Relationship: eBPF and ftrace actually share some underlying infrastructure (like "tracepoints" and "kprobes"), but eBPF is more powerful (and more complex) while ftrace remains the quickest, easiest way to debug the kernel without writing code.
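The "smart camera" filter quoted above can be sketched as a bpftrace program wrapped in shell. Treat this as an illustration under assumptions: bpftrace is installed, the kernel exposes the syscalls:sys_enter_openat tracepoint, and the script runs as root. The two function names are hypothetical, and newer bpftrace releases may prefer `args.filename` over `args->filename`.

```shell
#!/bin/sh
# Sketch: report non-root attempts to open /etc/shadow using bpftrace.
# Assumptions: bpftrace installed, root privileges, openat tracepoint present.

# Print the bpftrace program text (probe, /filter/, action block).
shadow_watch_program() {
    cat <<'EOF'
tracepoint:syscalls:sys_enter_openat
/str(args->filename) == "/etc/shadow" && uid != 0/
{
    printf("%s (uid %d) tried to open /etc/shadow\n", comm, uid);
}
EOF
}

# Run the program until interrupted with Ctrl-C.
run_shadow_watch() {
    bpftrace -e "$(shadow_watch_program)"
}
```

The filter between the slashes runs inside the kernel, so events that do not match are dropped before they ever reach user space. That in-kernel logic is the part that plain ftrace tracers do not offer.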
Summary Table
| Tool | Focus | Level | Mechanism | Overhead |
|---|---|---|---|---|
| ptrace | Controlling a process | System Call | Context switching | High |
| strace | App -> Kernel calls | User-space | ptrace | High |
| ltrace | App -> Library calls | User-space | ptrace / Breakpoints | High |
| ftrace | Kernel -> Kernel calls | Kernel-space | Binary patching (NOPs) | Low |
| blktrace | Disk I/O (Blocks) | Kernel/Hardware | Tracepoints | Low |