Linux - Linker
What is a Linker
A Linker is a development tool that performs the final step in creating an executable program or a library. It takes one or more "object files" (generated by a compiler) and combines them into a single executable or library file that the operating system can load and run.
What does a Linker actually do?
Imagine you are writing a large program. You don't put all 100,000 lines of code in one file; you split them into multiple files (e.g., main.c, math_utils.c, network.c).
- The Compiler translates each .c file into an Object File (.o or .obj). These files contain machine code, but they are "incomplete." For example, main.o might call a function called calculate_result(), but that function's code is actually inside math_utils.o.
- The Linker steps in to "connect the dots." Its main jobs are:
A. Symbol Resolution
The linker looks at all the object files. It sees that main.o is looking for calculate_result. It searches the other files, finds that function in math_utils.o, and creates a link between the two. If it can't find it, you get the famous "Undefined Reference" error.
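The lookup described above can be sketched as a toy model in Python. This is only an illustration: the dictionary layout and the names `resolve_symbols` and `UndefinedReference` are assumptions made for this sketch, and a real linker reads symbol tables from the object files instead.

```python
# Toy model of symbol resolution; not how ld is implemented.

class UndefinedReference(Exception):
    """Raised when a referenced symbol has no definition in any input."""


def resolve_symbols(objects):
    """Map each (file, referenced symbol) pair to the file that defines it.

    `objects` maps a file name to {"defines": [...], "references": [...]}.
    """
    # Pass 1: record which file defines each symbol.
    definitions = {}
    for name, obj in objects.items():
        for symbol in obj["defines"]:
            if symbol in definitions:
                raise ValueError(f"multiple definition of `{symbol}'")
            definitions[symbol] = name

    # Pass 2: every reference must match a recorded definition.
    resolved = {}
    for name, obj in objects.items():
        for symbol in obj["references"]:
            if symbol not in definitions:
                raise UndefinedReference(
                    f"{name}: undefined reference to `{symbol}'"
                )
            resolved[(name, symbol)] = definitions[symbol]
    return resolved


objects = {
    "main.o": {"defines": ["main"], "references": ["calculate_result"]},
    "math_utils.o": {"defines": ["calculate_result"], "references": []},
}
print(resolve_symbols(objects))
```

Removing math_utils.o from `objects` makes the sketch raise `UndefinedReference`, which mirrors the error you see when an object file is left off the link command.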
B. Relocation
The compiler does not know where a file's code will end up in the final program, so each object file's sections are laid out as if they started at address zero. But you can't have five different files all starting at address zero in the same program. The linker "shuffles" the code, assigning each section a unique, non-overlapping memory address. It then updates every place that depends on an address (jump and call instructions, as well as references to data) to point to these new, correct addresses.
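A minimal sketch of that process, again as a toy Python model. The data layout and the function name `relocate` are assumptions for illustration. It uses absolute addresses for simplicity, whereas real relocations come in several kinds (many calls are encoded relative to the calling instruction).

```python
def relocate(objects, base_address):
    """Assign each object a non-overlapping address range and patch call sites.

    Each object is {"size": int, "symbols": {name: offset},
                    "calls": [(offset_of_call, target_symbol), ...]}.
    All offsets assume the object itself begins at address zero.
    """
    # Step 1: place the objects one after another.
    bases = {}
    address = base_address
    for name, obj in objects.items():
        bases[name] = address
        address += obj["size"]

    # Step 2: compute the final address of every symbol.
    symbol_addresses = {}
    for name, obj in objects.items():
        for symbol, offset in obj["symbols"].items():
            symbol_addresses[symbol] = bases[name] + offset

    # Step 3: for each call site, record the address that must be written there.
    patches = {}
    for name, obj in objects.items():
        for offset, target in obj["calls"]:
            patches[bases[name] + offset] = symbol_addresses[target]
    return bases, patches


objects = {
    "main.o": {"size": 0x40, "symbols": {"main": 0x0},
               "calls": [(0x10, "calculate_result")]},
    "math_utils.o": {"size": 0x20, "symbols": {"calculate_result": 0x0},
                     "calls": []},
}
bases, patches = relocate(objects, base_address=0x1000)
print({name: hex(addr) for name, addr in bases.items()})
print({hex(site): hex(target) for site, target in patches.items()})
```

Both objects started out "at zero", yet after relocation main.o occupies 0x1000 to 0x103f and math_utils.o begins at 0x1040, so the call inside main.o is patched to target 0x1040.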
C. Section Merging
The linker takes all the .text (code) sections from all files and glues them into one big .text block. It does the same for data and other sections.
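The gluing step can be sketched as follows, assuming a toy representation in which each object file is a dictionary of section name to raw bytes (the function name `merge_sections` is hypothetical). The sketch also records where each input landed, since relocation needs those offsets.

```python
def merge_sections(objects):
    """Concatenate same-named sections from all inputs.

    `objects` maps a file name to {section_name: bytes}. Returns the merged
    sections plus, for each (file, section), the offset at which that input's
    contribution starts inside the merged section.
    """
    merged = {}
    placement = {}
    for file_name, sections in objects.items():
        for section_name, contents in sections.items():
            buffer = merged.setdefault(section_name, bytearray())
            placement[(file_name, section_name)] = len(buffer)
            buffer += contents  # extends the shared bytearray in place
    return {name: bytes(buf) for name, buf in merged.items()}, placement


objects = {
    "main.o": {".text": b"\x01\x02\x03", ".data": b"\xaa"},
    "math_utils.o": {".text": b"\x04\x05", ".data": b"\xbb\xcc"},
}
merged, placement = merge_sections(objects)
print(merged[".text"])
print(placement[("math_utils.o", ".text")])
```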
Static vs. Dynamic Linking
There are two ways a linker can work:
- Static Linking: The linker copies the library code your program needs (like printf from the standard library) directly into your executable.
  - Result: A larger file, but it has no library dependencies at run time. It can run on its own.
- Dynamic Linking: The linker doesn't copy the code. Instead, it leaves a "note" in the executable saying: "When this program starts, please find libc.so and look for printf there."
  - Result: A smaller executable, but it requires the library to be present on the system. (This is how most modern software works; the vDSO is likewise a shared object, which the kernel maps into each process.)
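The trade-off between the two modes can be made concrete with a small Python model. Everything here is an assumption for illustration: the names `link_static`, `link_dynamic`, `load`, and `size`, and the 64-byte stand-in for printf. It captures only the bookkeeping, not real file formats.

```python
# A stand-in for the C library: one function whose "code" is 64 bytes long.
LIBC = {"name": "libc.so", "functions": {"printf": b"\x90" * 64}}


def link_static(program, library):
    """Copy the library's code into the executable; nothing is needed later."""
    code = dict(program)
    code.update(library["functions"])
    return {"code": code, "needed": []}


def link_dynamic(program, library):
    """Keep only the program's code plus a note naming the library."""
    return {"code": dict(program), "needed": [library["name"]]}


def load(executable, installed_libraries):
    """Mimic program start-up: every library named in a note must be found."""
    functions = dict(executable["code"])
    for name in executable["needed"]:
        if name not in installed_libraries:
            raise FileNotFoundError(f"cannot open shared object file: {name}")
        functions.update(installed_libraries[name]["functions"])
    return functions


def size(executable):
    """Total number of code bytes stored in the executable itself."""
    return sum(len(body) for body in executable["code"].values())


program = {"main": b"\x90" * 16}
static_exe = link_static(program, LIBC)
dynamic_exe = link_dynamic(program, LIBC)
print(size(static_exe), size(dynamic_exe))
```

Calling `load(dynamic_exe, {})` raises `FileNotFoundError`, which models the missing-library failure, while `load(static_exe, {})` succeeds on a system with no libraries installed.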
Is it specific to Linux?
No. While the concept is the same everywhere, the File Formats and the Tool Names differ:
| OS | Linker Name | Binary Format | Library Extension |
|---|---|---|---|
| Linux | ld (GNU) / lld (LLVM) | ELF | .so (Shared Object) |
| Windows | link.exe (MSVC) | PE (Portable Executable) | .dll (Dynamic Link Library) |
| macOS | ld (Apple) | Mach-O | .dylib (Dynamic Library) |
What is the .lds file?
An .lds file is a Linker Script.
It is a configuration file used by the Linker (GNU ld, which ships with GNU Binutils and is invoked by GCC, or lld from LLVM) to determine exactly how the final executable or shared library should be laid out in memory.
While standard applications usually use a "default" linker script provided by the system, low-level software like the Linux Kernel, bootloaders, and the vDSO require custom .lds files to control their internal structure with extreme precision.
What does a Linker Script do?
When you compile code, the compiler creates many separate "object files" (.o). These files contain different sections:
- .text: The actual machine code instructions.
- .data: Global variables that can change.
- .rodata: Constant data (like strings).
- .bss: Global variables that start out as zero (uninitialized or explicitly zero-initialized).
The Linker Script tells the linker:
- Memory Layout: At what virtual memory address should the code start? (e.g., 0xffffffff81000000).
- Section Mapping: Which sections from which object files should be grouped together in the final binary?
- Alignment: Should sections be aligned to 4KB pages or 16-byte boundaries?
- Symbol Definition: It can "invent" variables. For example, it can define a symbol __bss_start at the beginning of the BSS section so the code knows where to start clearing memory.
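The alignment and symbol-definition duties in the list above boil down to arithmetic on a running "location counter". The sketch below models that in Python; the function names and the section sizes are made up for illustration, and only the start address and the __bss_start symbol come from the text above.

```python
def align_up(address, alignment):
    """Round `address` up to the next multiple of `alignment` (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)


def lay_out(start, sections):
    """Walk a location counter over `sections`, given as (name, size, alignment)."""
    addresses = {}
    symbols = {}
    location = start
    for name, size, alignment in sections:
        location = align_up(location, alignment)
        addresses[name] = location
        if name == ".bss":
            # The script "invents" a symbol marking where .bss begins.
            symbols["__bss_start"] = location
        location += size
    return addresses, symbols


addresses, symbols = lay_out(
    0xffffffff81000000,
    [(".text", 0x1234, 0x1000), (".data", 0x10, 16), (".bss", 0x100, 0x1000)],
)
for name, address in addresses.items():
    print(name, hex(address))
print("__bss_start", hex(symbols["__bss_start"]))
```

With these example sizes, .data is bumped from 0x...1234 to the next 16-byte boundary and .bss to the next 4KB page, so startup code could clear memory beginning at __bss_start.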
A Simple Example
A simplified .lds file might look like this:
SECTIONS
{
    . = 0x10000;           /* Start at address 0x10000 */
    .text : { *(.text) }   /* Put all code here */
    . = 0x8000000;         /* Move to a new address */
    .data : { *(.data) }   /* Put all data here */
}
Why the .S extension? (vdso.lds.S)
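Often, you will see the extension .lds.S. The .S suffix is borrowed from assembly files, where it marks a source that is run through the C preprocessor before use. A .lds.S file is therefore a linker script template: the build first preprocesses it into a plain .lds file, and only that output is handed to the linker. This allows developers to use C-style macros and conditionals (like #define or #ifdef) inside the linker script. This is common when the same script needs to support different CPU architectures (like x86 vs. ARM).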
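To show what preprocessing buys you, here is a toy stand-in for the C preprocessor written in Python. It is a sketch under stated assumptions: it handles only non-nested #ifdef/#else/#endif plus simple name substitution, real builds use the actual C preprocessor, and the macro names CONFIG_X86_64 and LOAD_ADDRESS in the template are chosen purely for illustration.

```python
import re

# A hypothetical .lds.S-style template; the macro names are illustrative.
TEMPLATE = """\
SECTIONS
{
#ifdef CONFIG_X86_64
    . = LOAD_ADDRESS;
#else
    . = 0x10000;
#endif
    .text : { *(.text) }
}
"""


def preprocess(source, defines):
    """Handle non-nested #ifdef/#else/#endif and substitute defined names."""
    output = []
    keep = True  # whether lines in the current region are emitted
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#ifdef"):
            keep = stripped.split()[1] in defines
        elif stripped == "#else":
            keep = not keep
        elif stripped == "#endif":
            keep = True
        elif keep:
            # Replace whole-word occurrences of each defined name.
            for name, value in defines.items():
                line = re.sub(rf"\b{re.escape(name)}\b", value, line)
            output.append(line)
    return "\n".join(output)


x86 = preprocess(
    TEMPLATE,
    {"CONFIG_X86_64": "1", "LOAD_ADDRESS": "0xffffffff81000000"},
)
other = preprocess(TEMPLATE, {})
print(x86)
print(other)
```

The same template yields two different plain linker scripts: one starting at 0xffffffff81000000 when CONFIG_X86_64 is defined, and one starting at 0x10000 otherwise.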
Summary
- What: A configuration file for the linker.
- Purpose: Defines the memory map and section organization of a binary.
- Who uses it: Kernel developers, gVisor engineers, and embedded systems programmers.
- Context: It is the "blueprint" used to build the vDSO and the Sentry.