Linux - kprobe / kretprobe

TL;DR

If you want to know when a function is called and with which arguments, use a kprobe. If you want to know what it returned or how long it took, use a kretprobe.

Concepts

Kprobes (Kernel Probes) and Kretprobes (Kernel Return Probes) are debugging and tracing mechanisms in the Linux kernel. They allow you to dynamically "break" into kernel execution to inspect data, monitor performance, or debug errors without needing to recompile the kernel or reboot the system.

1. Kprobe (Kernel Probe)

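A kprobe is the standard probe. It can be attached to almost any instruction in the kernel; the exceptions are a small set of blacklisted functions, such as parts of the kprobes machinery itself. It is most commonly used to inspect system state before an instruction executes, for example to check function arguments at function entry.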

How it works:
1. Registration: You register a probe at a specific kernel address (usually the start of a function).
2. Breakpoint: The kernel replaces the instruction at that address with a breakpoint instruction (e.g., int3 on x86).
3. Trap: When the CPU hits that address, it triggers a "trap" (exception).
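4. Handler: The trap handler saves the registers and calls your custom "pre-handler" function with them.
5. Resume: The original instruction is executed from a saved copy (the breakpoint has overwritten it in place), an optional "post-handler" runs, and the system continues as normal.

Best for: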
- Checking input arguments (e.g., "Which file is being opened?").
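- Tracing execution paths (e.g., "Did execution reach this branch?"). Probes attach to instruction addresses, so mapping a source line to an address requires debug information.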
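
The five steps above can be modeled in plain userspace C. This is only a toy sketch: the "kernel text" is a byte array, and `toy_probe`, `toy_register`, `toy_unregister`, `toy_step` and `count_hit` are hypothetical names invented for illustration. In the kernel itself the equivalent work is done by `register_kprobe()` on a `struct kprobe` with a `pre_handler`, which this sketch does not call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BREAKPOINT 0xCC /* opcode of int3 on x86 */

/* Hypothetical toy type, not the kernel's struct kprobe. */
struct toy_probe {
    size_t addr;          /* index into the fake text segment */
    uint8_t saved_opcode; /* original byte displaced by the breakpoint */
    void (*pre_handler)(size_t addr, void *ctx);
    void *ctx;            /* opaque data passed to the handler */
};

/* Steps 1-2: remember the original byte, then patch in the breakpoint. */
void toy_register(uint8_t *text, struct toy_probe *p)
{
    assert(p->pre_handler != NULL);
    p->saved_opcode = text[p->addr];
    text[p->addr] = BREAKPOINT;
}

/* Undo the patch so the text is byte-for-byte what it was before. */
void toy_unregister(uint8_t *text, struct toy_probe *p)
{
    text[p->addr] = p->saved_opcode;
}

/* Steps 3-5: "execute" the byte at pc and return the opcode that ran. */
uint8_t toy_step(const uint8_t *text, size_t pc, const struct toy_probe *p)
{
    if (pc == p->addr && text[pc] == BREAKPOINT) {
        p->pre_handler(pc, p->ctx); /* trap: the handler runs first */
        return p->saved_opcode;     /* then the saved original runs */
    }
    return text[pc];                /* unprobed address: nothing special */
}

/* Example pre-handler: count how often the probe fired. */
void count_hit(size_t addr, void *ctx)
{
    (void)addr;
    ++*(int *)ctx;
}
```

Stepping over the probed address calls `count_hit` once and still yields the original opcode, which is the point of the mechanism: the probed code behaves the same, it just gets observed on the way through.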

2. Kretprobe (Kernel Return Probe)

A kretprobe is a specialized probe designed to trigger when a function returns (finishes). It is used to inspect the output of a function.

How it works:
1. Entry: When the function is called, a standard kprobe at the entry fires.
2. Hijack: The kernel takes the "return address" (the location the function is supposed to go back to when it finishes) and saves it away. It replaces it with the address of a "trampoline" function.
3. Execution: The function runs normally.
4. Trampoline: When the function tries to return, it unknowingly jumps to the trampoline instead of the original caller.
5. Handler: The trampoline executes your "return-handler" (where you can see the return value).
6. Restore: The trampoline then jumps back to the real return address, and the system proceeds.

Best for:
- Checking return values (e.g., "Did the function return an error code?").
- Measuring function duration (Time at entry vs. Time at return).
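
The hijack-and-restore sequence, including the entry/return timing use case, can be sketched as a userspace toy model. Everything here is an assumption made for illustration: the return address is just an integer slot, timestamps are passed in by the caller, and `toy_retprobe`, `toy_on_entry`, `toy_trampoline`, `record_result`, `TRAMPOLINE_ADDR` and `MAX_ACTIVE` are hypothetical names. The real kernel API is `register_kretprobe()` on a `struct kretprobe`, whose `maxactive` and `nmissed` fields play a role similar to the small instance pool below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRAMPOLINE_ADDR ((uintptr_t)0xDEAD0000u) /* hypothetical marker */
#define MAX_ACTIVE 4 /* how many calls can be in flight at once */

/* One in-flight call: what we need to restore and to time it. */
struct toy_instance {
    uintptr_t real_ret; /* saved original return address */
    uint64_t entry_ns;  /* timestamp taken at function entry */
    int in_use;
};

struct toy_result {
    long retval;
    uint64_t duration_ns;
};

/* Hypothetical toy type, not the kernel's struct kretprobe. */
struct toy_retprobe {
    struct toy_instance inst[MAX_ACTIVE];
    int nmissed; /* entries skipped because the pool was full */
    void (*handler)(long retval, uint64_t duration_ns, void *ctx);
    void *ctx;
};

/* Steps 1-2: on entry, stash the real return address and the entry
 * time, then overwrite the slot with the trampoline address.
 * Returns the instance index, or -1 if no instance was free. */
int toy_on_entry(struct toy_retprobe *rp, uintptr_t *ret_slot, uint64_t now_ns)
{
    for (int i = 0; i < MAX_ACTIVE; i++) {
        if (!rp->inst[i].in_use) {
            rp->inst[i].in_use = 1;
            rp->inst[i].real_ret = *ret_slot;
            rp->inst[i].entry_ns = now_ns;
            *ret_slot = TRAMPOLINE_ADDR;
            return i;
        }
    }
    rp->nmissed++;
    return -1;
}

/* Steps 4-6: the "return" lands here. Run the return handler, free
 * the instance, and hand back the address to really return to. */
uintptr_t toy_trampoline(struct toy_retprobe *rp, int idx, long retval,
                         uint64_t now_ns)
{
    assert(idx >= 0 && idx < MAX_ACTIVE && rp->inst[idx].in_use);
    rp->handler(retval, now_ns - rp->inst[idx].entry_ns, rp->ctx);
    rp->inst[idx].in_use = 0;
    return rp->inst[idx].real_ret;
}

/* Example return handler: record the return value and the duration. */
void record_result(long retval, uint64_t duration_ns, void *ctx)
{
    struct toy_result *r = ctx;
    r->retval = retval;
    r->duration_ns = duration_ns;
}
```

Keeping one instance per in-flight call is what lets the model cope with recursion or concurrent callers: each call has its own saved return address and entry timestamp, and a full pool is counted as a miss instead of corrupting another call's state.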

Summary Comparison

Feature	Kprobe	Kretprobe
Where it triggers	Any instruction (usually the start).	When the function returns (exits).
Primary Goal	Inspect Arguments & Logic.	Inspect Return Values & Timing.
Mechanism	CPU Breakpoint (int3 on x86).	Return Address Hijacking (Trampoline).
Typical Use	"What is this function doing?"	"Did this function succeed or fail?"

Real-World Example

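Imagine you want to trace the sys_open function (which opens files). The exact symbol depends on kernel version and architecture (recent kernels expose names such as do_sys_openat2 or __x64_sys_openat instead), so check /proc/kallsyms for what your kernel actually provides: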

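You use a Kprobe at the start of the function to read the argument registers. The filename argument is a pointer to a string in user space, so the handler has to copy that string before it can see the name (e.g., "secret.txt").
You use a Kretprobe on the function's return to read the return value. Inside the kernel, failures are negative errno values (e.g., -13, i.e. -EACCES, for permission denied), while success is the new file descriptor (e.g., 3).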
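
As a small illustration of how such a return value might be interpreted, the sketch below formats a raw kernel-style return value. `describe_open_ret` is a hypothetical helper written for this example and runs in userspace; in a real kretprobe handler the raw value would come from `regs_return_value(regs)`. The message text produced by `strerror()` depends on the C library.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn a raw kernel-style return value from an
 * open call into a human-readable line. Negative values are -errno,
 * non-negative values are file descriptors.
 * Returns what snprintf returns (length of the formatted text). */
int describe_open_ret(long ret, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (ret < 0)
        return snprintf(buf, len, "failed: errno %ld (%s)",
                        -ret, strerror((int)-ret));
    return snprintf(buf, len, "ok: fd %ld", ret);
}
```

Passing -EACCES yields a line starting with "failed: errno 13", and passing 3 yields "ok: fd 3".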