PID vs TGID
The distinction between PID (Process ID) and TGID (Thread Group ID) is the source of a lot of confusion because the Linux Kernel and User-Space (the terminal) use these terms differently.
The Kernel's Perspective
In the Linux kernel, there is no "special" thing called a thread. A thread is just another task_struct that happens to share memory and files with another task. Therefore:
- PID (Process ID): In the kernel, every single thread has its own unique PID. If you have a process with 5 threads, the kernel sees 5 different PIDs.
- TGID (Thread Group ID): This is the PID of the "Thread Group Leader" (the first thread created). All threads in the same process share the same TGID.
The User-Space Perspective (What you see)
POSIX (the standard for Unix-like systems) requires that all threads in a single process share the same Process ID. To satisfy this, Linux "lies" to you in the terminal:
- What you call "PID" in ps, top, or htop is actually the TGID from the kernel's perspective.
- What you call "TID" (Thread ID) is actually the PID from the kernel's perspective.
A Practical Example
Imagine you run a program called MyBrowser. It starts up and then creates two additional threads.
| Thread | Kernel PID | Kernel TGID | User-Space Name |
|---|---|---|---|
| Main Thread (Leader) | 1000 | 1000 | "The Process" (PID 1000) |
| Worker Thread 1 | 1001 | 1000 | "Thread 1" (TID 1001) |
| Worker Thread 2 | 1002 | 1000 | "Thread 2" (TID 1002) |
- If you run kill 1000, the signal (SIGTERM by default) is sent to the thread group with TGID 1000, and its default action terminates every task belonging to that group.
- If you look at /proc/1000/task/, you will see folders named 1000, 1001, and 1002.
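To check the /proc layout from inside a program, here is a minimal sketch that lists /proc/self/task and collects the numeric folder names. The helper names list_task_ids and contains_id are made up for this example, and it assumes a Linux system with /proc mounted.

```c
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: collect the kernel PIDs (user-space TIDs) of the
 * tasks in the calling process by listing /proc/self/task.
 * Stores at most `max` IDs in `out`; returns how many were stored,
 * or -1 if the directory cannot be opened. */
int list_task_ids(pid_t *out, int max)
{
    assert(out != NULL && max > 0);

    DIR *dir = opendir("/proc/self/task");
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while (count < max && (entry = readdir(dir)) != NULL) {
        char *end;
        long id = strtol(entry->d_name, &end, 10);
        if (end == entry->d_name || *end != '\0')
            continue;               /* skips "." and ".." */
        out[count++] = (pid_t)id;
    }
    closedir(dir);
    return count;
}

/* Hypothetical helper: returns 1 if `id` is among the first `n` entries. */
int contains_id(const pid_t *ids, int n, pid_t id)
{
    for (int i = 0; i < n; i++)
        if (ids[i] == id)
            return 1;
    return 0;
}
```

Called from the main thread, the list should include getpid(), because the group leader's kernel PID equals the TGID.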
How to see them in C code
When writing C code, the functions you use return different values based on this logic:
- getpid(): Returns the TGID. (It tells you the ID of the "process" as a whole.)
- gettid(): Returns the kernel PID. (It tells you the unique ID of the specific thread calling it.)
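A small sketch to see the difference for yourself: a worker thread records what getpid() and the gettid system call report. The names thread_ids, current_tid, and ids_seen_by_worker are invented for this example. It calls syscall(SYS_gettid) directly because older glibc versions (before 2.30) do not ship a gettid() wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* What one thread sees when it asks for its IDs. */
struct thread_ids {
    pid_t pid;   /* getpid(): the TGID, shared by all threads      */
    pid_t tid;   /* gettid(): the kernel PID, unique to the thread */
};

/* Hypothetical helper: the calling thread's kernel PID. */
pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

static void *record_ids(void *arg)
{
    struct thread_ids *out = arg;
    out->pid = getpid();
    out->tid = current_tid();
    return NULL;
}

/* Runs record_ids() in a new thread and waits for it.
 * Returns 0 on success, -1 if the thread could not be created or joined. */
int ids_seen_by_worker(struct thread_ids *out)
{
    assert(out != NULL);

    pthread_t thread;
    if (pthread_create(&thread, NULL, record_ids, out) != 0)
        return -1;
    return pthread_join(thread, NULL) == 0 ? 0 : -1;
}
```

The worker should report the same pid as the main thread but a different tid. In the main thread itself, the two values are equal because it is the group leader.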
Why does this matter?
This architecture is what makes Linux's "Lightweight Process" (LWP) design so powerful.
- Scheduling: The scheduler doesn't care about TGIDs. It only looks at PIDs (tasks). It treats every thread as an independent entity to be scheduled on any available CPU core.
- Signals: When you send a signal (like SIGSTOP) to a "Process ID" (TGID), the kernel treats it as directed at the whole thread group. A stop signal such as SIGSTOP stops every task_struct sharing that TGID.
- Exit: If one thread calls exit(), the C library invokes the exit_group() system call, and the kernel uses the thread group to find all "sibling" threads and kill them too, ensuring the whole process exits together. (The raw exit system call, by contrast, terminates only the calling thread.)
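For contrast with group-wide signals, Linux also lets you aim a signal at one specific task by naming both IDs: tgkill(tgid, tid, sig). The sketch below is only an illustration under a few assumptions: the function name signal_only_worker is made up, it installs a process-wide SIGUSR1 handler, and it uses the raw syscall because not every glibc version exposes a tgkill() wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static atomic_int worker_tid;   /* kernel PID published by the worker   */
static atomic_int handled_by;   /* kernel PID of the task that ran the
                                   handler, 0 until the signal arrives  */

static void on_sigusr1(int sig)
{
    (void)sig;
    atomic_store(&handled_by, (int)syscall(SYS_gettid));
}

static void *worker(void *arg)
{
    (void)arg;
    atomic_store(&worker_tid, (int)syscall(SYS_gettid));
    while (atomic_load(&handled_by) == 0)
        usleep(1000);           /* wait until some task handled the signal */
    return NULL;
}

/* Hypothetical demo: returns 1 if the thread-directed signal was handled
 * by the worker itself, 0 if another task handled it, -1 on setup errors. */
int signal_only_worker(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    atomic_store(&worker_tid, 0);
    atomic_store(&handled_by, 0);

    pthread_t thread;
    if (pthread_create(&thread, NULL, worker, NULL) != 0)
        return -1;
    while (atomic_load(&worker_tid) == 0)
        usleep(1000);           /* wait for the worker to publish its TID */

    /* First argument: the TGID (what getpid() returns).
     * Second argument: the kernel PID of the one task to signal. */
    int target = atomic_load(&worker_tid);
    long rc = syscall(SYS_tgkill, getpid(), target, SIGUSR1);
    if (rc != 0)
        atomic_store(&handled_by, -1);   /* release the worker on failure */

    pthread_join(thread, NULL);
    if (rc != 0)
        return -1;
    return atomic_load(&handled_by) == target;
}
```

With kill(getpid(), SIGUSR1) instead, the kernel would be free to pick any thread in the group that does not block the signal.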
Summary Comparison Table
| Term | Kernel Meaning | User-Space Meaning | Shared by threads? |
|---|---|---|---|
| PID | The unique ID of a task_struct | The Thread ID (TID) | No (Each thread has its own) |
| TGID | The ID of the group leader | The Process ID (PID) | Yes (All threads share one) |
The "Golden Rule": If you are inside the kernel, PID means "the specific thread." If you are at a bash prompt, PID means "the whole thread group."
A PID is actually a complex structure
From user space's point of view, a PID is just an integer (a pid_t in C, which is a 32-bit signed int on Linux).
Inside the Linux kernel, however, it is backed by a more complex object called struct pid.
This change was a major architectural shift required to support Containers (Namespaces).
Here is why the kernel moved away from simple integers and how struct pid works.
1. The Problem: PID Namespaces
Before containers, a PID was just a global integer. But in modern Linux, we have PID Namespaces.
Imagine you are running a Docker container:
- Inside the container, a process thinks its PID is 1.
- On the host machine, that same process might actually be PID 5678.
A single process now has multiple ID numbers depending on who is looking at it. A simple integer field in the task_struct can no longer hold all these different values.
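You can observe these multiple numbers from user space: on reasonably recent kernels (4.1 and later, as far as I know), /proc/[pid]/status has an NSpid: line listing the process's PID in each nested namespace, outermost first. The parser below is a minimal sketch; the function name read_nspid is made up, and the status path is passed in as a parameter so the code can be tried on any file.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: parse the "NSpid:" line of a status file.
 * Stores up to `max` PIDs in `out`, outermost namespace first.
 * Returns the number of PIDs found (0 if the line is absent),
 * or -1 if the file cannot be opened. */
int read_nspid(const char *status_path, pid_t *out, int max)
{
    assert(status_path != NULL && out != NULL && max > 0);

    FILE *f = fopen(status_path, "r");
    if (f == NULL)
        return -1;

    char line[512];
    int count = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "NSpid:", 6) != 0)
            continue;
        char *p = line + 6;
        while (count < max) {
            char *end;
            long value = strtol(p, &end, 10);  /* skips tabs and spaces */
            if (end == p)
                break;                         /* no more numbers */
            out[count++] = (pid_t)value;
            p = end;
        }
        break;                                 /* only one NSpid line */
    }
    fclose(f);
    return count;
}
```

For the container example above, reading the process's status file from the host would be expected to show two numbers, 5678 and 1. Reading /proc/self/status outside any container usually shows a single number.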
2. The Solution: struct pid
Instead of storing an integer, the task_struct stores a pointer to a struct pid. This structure acts as a "hub" that connects the process to its various ID numbers across different namespaces.
Here is a simplified view of what's inside struct pid:
```c
struct pid {
    refcount_t count;                     // Reference counter
    unsigned int level;                   // Depth of the deepest namespace this PID lives in (0 = host only)
    struct hlist_head tasks[PIDTYPE_MAX]; // Lists of tasks using this PID (Threads/Groups)
    struct upid numbers[1];               // THE MAGIC: Array of IDs, level + 1 entries (one per namespace)
};
```
3. The Magic: struct upid
The numbers[] array at the end of struct pid contains struct upid objects. Each upid represents the specific "view" of that process in a specific namespace level:
```c
struct upid {
    int nr;                   // The actual integer ID (e.g., 1 or 5678)
    struct pid_namespace *ns; // Which namespace does this number belong to?
};
```
Example Scenario:
If you have a process inside a nested container (Level 2), its struct pid will have an array of 3 upids:
- Level 0 (Host): nr = 5678, ns = Global
- Level 1 (Container): nr = 42, ns = Container1
- Level 2 (Nested): nr = 1, ns = NestedChild
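The scenario above can be modelled in plain user-space C. This is only a sketch: model_pid, model_upid, and model_alloc_pid are invented names, a string stands in for the namespace pointer, and the real kernel allocator does far more. The point it illustrates is why the array sits at the end of the struct: one allocation is sized for level + 1 entries.

```c
#include <assert.h>
#include <stdlib.h>

/* User-space model only, not kernel code. */
struct model_upid {
    int nr;                 /* the number seen in one namespace      */
    const char *ns_name;    /* stand-in for struct pid_namespace *ns */
};

struct model_pid {
    unsigned int level;             /* depth of the deepest namespace */
    struct model_upid numbers[];    /* level + 1 entries              */
};

/* Hypothetical allocator: one block holding the header plus one upid
 * per namespace level, from level 0 (host) down to `level`.
 * `nrs` and `ns_names` must each have level + 1 elements. */
struct model_pid *model_alloc_pid(unsigned int level,
                                  const int *nrs,
                                  const char *const *ns_names)
{
    assert(nrs != NULL && ns_names != NULL);

    size_t entries = (size_t)level + 1;
    struct model_pid *pid =
        malloc(sizeof *pid + entries * sizeof pid->numbers[0]);
    if (pid == NULL)
        return NULL;

    pid->level = level;
    for (size_t i = 0; i < entries; i++) {
        pid->numbers[i].nr = nrs[i];
        pid->numbers[i].ns_name = ns_names[i];
    }
    return pid;
}
```

For the nested container, you would call it with level 2, the numbers {5678, 42, 1}, and the names {"Global", "Container1", "NestedChild"}, then release the block with free().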
4. Why is it an "Object" and not just a table?
By making the PID a standalone structure, the kernel gains two massive advantages:
A. Decoupling Life Cycles
A raw PID number can be recycled once its process has died and been reaped. This causes race conditions: you might try to kill PID 100, but just before your signal is sent, the old PID 100 exits and a new, innocent process is assigned PID 100.
Now, struct pid is reference-counted. Even if the process (task_struct) dies, the struct pid can stay alive as long as something else (like a /proc file handle) is still referring to it. If the number is later recycled, the new process gets a fresh struct pid, so a holder of the old reference can never mistake the new process for the old one.
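The lifetime rule can be sketched with a toy reference counter. Everything here is a user-space model with invented names (ref_pid, ref_pid_new, ref_get_pid, ref_put_pid, ref_task_exit). The kernel's own get_pid() and put_pid() use atomic reference counts, which this sketch leaves out.

```c
#include <assert.h>
#include <stdlib.h>

/* User-space model only, not kernel code. */
struct ref_pid {
    int count;      /* how many holders still reference this object */
    int nr;         /* the PID number it was created for            */
    int has_task;   /* 1 while a live task is attached              */
};

/* A new task starts with one reference, owned by the task itself. */
struct ref_pid *ref_pid_new(int nr)
{
    struct ref_pid *pid = calloc(1, sizeof *pid);
    if (pid == NULL)
        return NULL;
    pid->count = 1;
    pid->nr = nr;
    pid->has_task = 1;
    return pid;
}

/* Another holder (say, an open /proc handle) takes a reference. */
struct ref_pid *ref_get_pid(struct ref_pid *pid)
{
    assert(pid != NULL && pid->count > 0);
    pid->count++;
    return pid;
}

/* Drops one reference; returns 1 if this was the last one and the
 * object was freed, 0 if other holders keep it alive. */
int ref_put_pid(struct ref_pid *pid)
{
    assert(pid != NULL && pid->count > 0);
    if (--pid->count == 0) {
        free(pid);
        return 1;
    }
    return 0;
}

/* The task dies: it detaches and drops its own reference. */
int ref_task_exit(struct ref_pid *pid)
{
    pid->has_task = 0;
    return ref_put_pid(pid);
}
```

In this model, a holder that took a reference before the task exited still sees a valid object afterwards, now marked as having no task, and the memory is released only when that holder calls ref_put_pid().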
B. Fast Lookups
The kernel uses struct pid to quickly find a task_struct. It maintains a lookup structure keyed by the PID integer and the namespace (a global hash table in older kernels, a per-namespace IDR in newer ones). This lets the kernel efficiently resolve "Who is PID 1 in Namespace X?"
5. How task_struct points to it
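The task_struct still has plain pid_t pid and pid_t tgid fields, but they only hold the numbers as seen from the initial (host) namespace. The namespace-aware identity is reached through a pointer:
```c
struct task_struct {
    ...
    pid_t pid;               // Global (host-namespace) number of this task
    pid_t tgid;              // Global number of the thread group leader
    struct pid *thread_pid;  // Namespace-aware identity of this task
    ...
};
```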
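When you call getpid() in C, the kernel performs a "translation":
- It finds the struct pid of the current task's thread group leader (gettid() uses the current task's own thread_pid pointer instead).
- It looks up the PID namespace the current task lives in.
- It indexes the numbers[] array in the struct pid at that namespace's level and checks that the entry really belongs to that namespace.
- It returns that entry's nr to you.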
Example: In a container
Inside a container, a process might think it is PID 1.
On the host (in the initial PID namespace), that same process might be seen as PID 4502.
The task_struct doesn't just store 4502; it points to a struct pid that keeps track of all the different numbers that process "owns" in different namespaces.
Summary
- The Integer (PID): Is just a "label" valid only within a specific namespace.
- The Structure (struct pid): Is the "source of truth" that lives in kernel memory and maps all those different labels back to the actual process.
This complexity is what allows your Linux machine to run hundreds of isolated containers, each thinking they are the "owner" of PID 1, without their PID numbers colliding on the host.