Linux - LD_PRELOAD

LD_PRELOAD is an environment variable that tells the dynamic linker to load one or more user-specified shared libraries (a list separated by spaces or colons) before any other library, including the C library, when a program is executed.

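This allows the preloaded library to override functions that the program resolves dynamically from other shared libraries (such as libc) with its own versions. Calls that are statically linked, inlined, or issued as direct system calls are not affected. The same mechanism serves legitimate debugging and instrumentation as well as stealthy attacks.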

The LD_ prefix refers to the dynamic linker/loader (ld.so), which reads this variable along with related ones such as LD_LIBRARY_PATH.

How It Works: The Dynamic Linking Process

To understand LD_PRELOAD, you need to know a little about dynamic linking.

  1. Dynamic Linking: Most programs on Linux are not self-contained. When you run a program like ls, it needs functions from shared libraries to work (e.g., the printf function from libc.so to print text to the screen). The job of the dynamic linker (ld.so or ld-linux.so) is to find these required libraries and "link" them to the program at runtime.

  2. The LD_PRELOAD Instruction: When you set the LD_PRELOAD environment variable, you are giving a special instruction to the dynamic linker. The instruction is:

    "Hey, before you go and load the normal libraries like libc.so, I want you to first load this other library I'm specifying. Give it the highest priority."

  3. Function Overriding (Hooking): Let's say your preloaded library (my_hacks.so) contains your own version of the printf function.

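    • When the program starts, the dynamic linker first loads my_hacks.so.
    • Then, it loads libc.so and the program's other dependencies.
    • When the program calls printf, the linker resolves the name by searching the loaded objects in order: the executable itself, then the preloaded libraries, then the remaining dependencies. Because my_hacks.so comes before libc.so in that order, the call is bound to your version of printf. The original printf in libc.so is never reached unless your version forwards to it explicitly.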

You have successfully "hooked" the printf function.
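
A hook does not have to replace the original outright. As a minimal sketch, assuming glibc on Linux, the wrapper below intercepts puts, looks up the next definition in the search order with dlsym(RTLD_NEXT, ...), and forwards to it. The "[hook]" message and the puts_fn type name are illustrative choices, not part of the text above.

```c
#define _GNU_SOURCE          /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdio.h>

typedef int (*puts_fn)(const char *);

/* Replacement for puts(): announces itself, then forwards to the real one. */
int puts(const char *s) {
    static puts_fn real_puts;   /* resolved once, on first use */

    if (!real_puts) {
        /* RTLD_NEXT: find the next "puts" after this object in the search order. */
        real_puts = (puts_fn)dlsym(RTLD_NEXT, "puts");
        if (!real_puts) {
            return EOF;         /* no other definition found */
        }
    }

    real_puts("[hook] puts intercepted");
    return real_puts(s);
}
```

Built as a shared library (gcc -shared -fPIC, plus -ldl on older glibc versions) and listed in LD_PRELOAD, a wrapper like this would see every dynamically resolved puts call while leaving the program's output otherwise intact.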

Practical Use Cases: The Good, The Bad, and The Ugly

LD_PRELOAD is a double-edged sword.

The Good (Debugging, Development, and Performance)

  • Debugging: You can preload a library that wraps around a function to print its arguments and return value, helping you debug how a program is interacting with a library without modifying the program's source code.
  • Testing: You can override functions to simulate error conditions. For example, you could override the malloc function (which allocates memory) to make it occasionally fail, so you can test how your program handles out-of-memory errors.
  • Performance Monitoring: gperftools' tcmalloc can be preloaded to replace the memory allocation functions (malloc, free), which makes it possible to profile a program's heap usage without relinking it.
  • Hardware Acceleration: You can preload a drop-in library that reimplements CPU-bound routines on a GPU. NVIDIA's NVBLAS, for example, can be preloaded to route supported BLAS calls to the GPU.
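
The debugging and testing cases above can be sketched together in one wrapper. This is an illustration under stated assumptions, not a definitive tool: it targets access() because its signature is simple, it assumes glibc on Linux, and the FAIL_ACCESS environment variable is a hypothetical switch invented here to trigger fault injection.

```c
#define _GNU_SOURCE          /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

typedef int (*access_fn)(const char *, int);

/* Wrapper for access(): logs arguments and result, optionally injects a failure. */
int access(const char *path, int mode) {
    static access_fn real_access;

    if (!real_access) {
        real_access = (access_fn)dlsym(RTLD_NEXT, "access");
        if (!real_access) {
            errno = ENOSYS;
            return -1;
        }
    }

    /* Hypothetical switch: when FAIL_ACCESS is set, simulate "permission denied". */
    if (getenv("FAIL_ACCESS") != NULL) {
        fprintf(stderr, "[trace] access(\"%s\", %d) = -1 (injected)\n", path, mode);
        errno = EACCES;
        return -1;
    }

    int ret = real_access(path, mode);
    int saved_errno = errno;    /* fprintf may clobber errno */
    fprintf(stderr, "[trace] access(\"%s\", %d) = %d\n", path, mode, ret);
    errno = saved_errno;
    return ret;
}
```

The trace goes to stderr so that it does not mix with the program's normal output. Wrapping malloc the same way is possible but needs more care, because the lookup and logging code may itself allocate memory.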

Example (A simple "Hello World" hook):

  1. Create a C file my_printf.c:

    #include <stdio.h>
    
    int printf(const char *format, ...) {
        // First, print our custom message
        puts("=== Function 'printf' was hooked by LD_PRELOAD! ===");
        // We can't call printf() again here, or we'll get an infinite loop!
        // This is a simplified example.
        return 0; // Just return 0
    }
    
  2. Compile it into a shared library:

    gcc -shared -fPIC -o my_printf.so my_printf.c
    
  3. Run a command with LD_PRELOAD:

    LD_PRELOAD=./my_printf.so ls
    

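    If the target program prints through printf, every such call now produces "=== Function 'printf' was hooked by LD_PRELOAD! ===" instead of its normal output. Whether ls is affected depends on the build: GNU ls typically writes file names through other stdio routines, and compilers may rewrite simple printf calls to puts or to fortified variants such as __printf_chk, in which case the listing appears unchanged. A small test program that calls printf with a format string and arguments demonstrates the hook more reliably.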
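
The simplified hook discards the text it was asked to print. One way around the infinite-loop problem, sketched here as an assumption rather than as the original author's method, is to forward the arguments to vprintf, which is a different symbol and therefore not intercepted. The "[hooked] " prefix is illustrative.

```c
#include <stdarg.h>
#include <stdio.h>

/* Replacement for printf(): adds a prefix, then prints the real message. */
int printf(const char *format, ...) {
    va_list ap;
    int written;

    fputs("[hooked] ", stdout);     /* marker showing that the hook ran */

    va_start(ap, format);
    written = vprintf(format, ap);  /* vprintf is not hooked, so no recursion */
    va_end(ap);

    /* Return the count for the original message so callers see the usual value. */
    return written;
}
```

It compiles with the same gcc -shared -fPIC command as the simplified version and can be preloaded in the same way.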

The Bad and The Ugly (Malware and Rootkits)

This is where LD_PRELOAD becomes a powerful tool for attackers: it is the classic technique for implementing a user-land rootkit.

  • Hiding Files: An attacker can create a library that overrides the readdir function (which returns the next entry of a directory). Their custom readdir calls the original function to get each real entry, but skips the names of the attacker's malicious files before returning to the program (ls, find, etc.).
  • Hiding Processes: ps and top discover processes by reading the /proc filesystem, so hooking the functions they use to read it (such as readdir or open) lets an attacker hide malicious processes.
  • Hiding Network Connections: netstat reads connection tables from files such as /proc/net/tcp, so hooking the functions that open or read those files lets an attacker hide connections to a command-and-control server.
  • Stealing Data: They can hook cryptographic functions (like SSL_write from OpenSSL) to intercept and steal data before it gets encrypted.
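
To make the file-hiding description concrete, and to show what defenders should look for, here is a minimal sketch of such a readdir filter. It assumes glibc on Linux; the "hidden_" prefix is a hypothetical marker chosen for illustration. It covers only readdir, whereas programs built with large-file support may call readdir64 instead.

```c
#define _GNU_SOURCE          /* needed for RTLD_NEXT on glibc */
#include <dirent.h>
#include <dlfcn.h>
#include <string.h>

#define HIDDEN_PREFIX "hidden_"   /* hypothetical marker, for illustration only */

typedef struct dirent *(*readdir_fn)(DIR *);

/* Replacement for readdir(): skips entries whose name starts with HIDDEN_PREFIX. */
struct dirent *readdir(DIR *dirp) {
    static readdir_fn real_readdir;
    struct dirent *entry;

    if (!real_readdir) {
        real_readdir = (readdir_fn)dlsym(RTLD_NEXT, "readdir");
        if (!real_readdir) {
            return NULL;
        }
    }

    /* Keep asking the real readdir until an entry passes the filter or the list ends. */
    while ((entry = real_readdir(dirp)) != NULL) {
        if (strncmp(entry->d_name, HIDDEN_PREFIX, strlen(HIDDEN_PREFIX)) != 0) {
            break;
        }
    }
    return entry;
}
```

Because the filtering happens inside the process that lists the directory, tools that do not go through the hooked function, such as statically linked binaries, still see the real contents.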

Security Note: Because of this potential for abuse, the dynamic linker restricts LD_PRELOAD when a program runs in secure-execution mode (e.g., if it's a setuid or setgid binary). In that mode, preload entries containing slashes are ignored, and libraries are preloaded only from the standard search directories and only if they have the set-user-ID mode bit set, which is unusual. This is a crucial security mechanism to prevent easy privilege escalation.
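
Whether a process is being treated this way can be checked from inside it. The sketch below assumes glibc on Linux, where getauxval(AT_SECURE) returns a nonzero value in secure-execution mode. The function names in_secure_execution_mode and report_preload_state are hypothetical helpers introduced here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/auxv.h>

/* Returns 1 if the kernel flagged this process for secure execution, else 0. */
int in_secure_execution_mode(void) {
    return getauxval(AT_SECURE) != 0;
}

/* Prints the secure-execution flag and the LD_PRELOAD value this process sees. */
void report_preload_state(void) {
    const char *preload = getenv("LD_PRELOAD");

    printf("secure-execution mode: %s\n",
           in_secure_execution_mode() ? "yes" : "no");
    printf("LD_PRELOAD: %s\n", preload ? preload : "(unset)");
}
```

For an ordinary, unprivileged program the flag is expected to be off, which is why LD_PRELOAD takes effect in the earlier examples.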