
Why does NPTL threading in Linux still assign a unique PID to each thread?

I am reading the pthreads man page and see the following:

With NPTL, all of the threads in a process are placed in the same thread group; all members of a thread group share the same PID.

My system runs NPTL 2.17, and when I run htop with threads shown, every PID is unique. Why? I expected some of them (e.g. the Chrome entries) to share the same PID.


See man gettid:

gettid() returns the caller's thread ID (TID). In a single-threaded process, the thread ID is equal to the process ID (PID, as returned by getpid(2)). In a multithreaded process, all threads have the same PID, but each one has a unique TID. For further details, see the discussion of CLONE_THREAD in clone(2).

What htop shows is the TID, not the PID. You can toggle the display of threads on and off with the H key.

You can also enable the PPID column in htop, which for threads shows the PID/TID of the main thread.
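The PID/TID distinction from the gettid man page can be checked from a small program. The following is a minimal sketch in Python (3.8 or later), not something from the original answer: os.getpid() stands in for getpid(2) and threading.get_native_id() for gettid(2), and the helper name collect_ids is made up for this illustration.

```python
import os
import threading


def collect_ids(n_threads=3):
    """Return the main thread's (pid, tid) and a list of worker (pid, tid) pairs."""
    results = []
    lock = threading.Lock()
    # The barrier keeps every worker alive until all of them have recorded
    # their IDs, so the kernel cannot recycle a TID while we are measuring.
    barrier = threading.Barrier(n_threads)

    def worker():
        ids = (os.getpid(), threading.get_native_id())
        with lock:
            results.append(ids)
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    main_ids = (os.getpid(), threading.get_native_id())
    return main_ids, results


if __name__ == "__main__":
    main_ids, worker_ids = collect_ids()
    print("main thread: pid=%d tid=%d" % main_ids)
    for pid, tid in worker_ids:
        print("worker:      pid=%d tid=%d" % (pid, tid))
```

Every line should report the same pid and a different tid, which is the set of numbers htop lists when thread display is enabled.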

Google's documentation for Chromium (which probably operates similarly to Chrome when it comes to these concepts) states that they use a "multi-process architecture". Your quote from pthread's man page states that all of the threads in a single process are placed under the same PID, which would not apply to Chrome's architecture.

The Linux kernel does have the concept of POSIX PIDs (explorable in /proc/*), but the kernel source calls them thread group IDs (TGIDs) and uses the term pid for its internal thread IDs (explorable in /proc/*/task/*).
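The /proc layout mentioned here can be inspected directly. This is a rough, Linux-only sketch, assuming a standard /proc mount; the function names read_status_fields, list_tasks, and demo are hypothetical. It lists /proc/self/task and reads the Tgid and Pid fields from each task's status file.

```python
import os
import threading


def read_status_fields(path, wanted=("Tgid", "Pid")):
    """Parse selected 'Key:<tab>value' lines from a /proc status file."""
    fields = {}
    with open(path) as f:
        for line in f:
            key, _, value = line.partition(":")
            if key in wanted:
                fields[key] = int(value.strip())
    return fields


def list_tasks():
    """Return {tid: {'Tgid': ..., 'Pid': ...}} for each thread of this process."""
    tasks = {}
    for name in os.listdir("/proc/self/task"):
        tasks[int(name)] = read_status_fields("/proc/self/task/%s/status" % name)
    return tasks


def demo(n_threads=2):
    """Start n_threads idle threads, snapshot /proc/self/task, then stop them."""
    stop = threading.Event()
    threads = [threading.Thread(target=stop.wait) for _ in range(n_threads)]
    for t in threads:
        t.start()
    try:
        return list_tasks()
    finally:
        stop.set()
        for t in threads:
            t.join()


if __name__ == "__main__":
    for tid, info in sorted(demo().items()):
        print("task dir %d: Tgid=%d Pid=%d" % (tid, info["Tgid"], info["Pid"]))
```

In each status file, Tgid is what POSIX calls the PID and is identical for all tasks, while the field the kernel labels Pid is the per-thread ID and matches the directory name.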

I believe this is rooted in Linux's original treatment of threads as "just processes" that happen to share address spaces and a bunch of other stuff with each other.

Your user-space tool is likely propagating this somewhat confusing Linux kernel terminology.

Because kernel-level threads are no more than processes with (nearly) the same address space.

The Linux kernel developers "solved" this by renaming: the processes became "threads", the "pid"s became "tid"s, and the old processes became "thread groups".

However, the sad truth is that when you create a thread on Linux (via clone()), you create a process that merely uses (nearly) the same memory segments.

This is the 1:1 thread model: every thread is a kernel-level thread, that is, essentially a process that shares its address space with the others.

Some other alternatives would be:

  • 1:M thread model (often written N:1). The kernel doesn't know about threads; user-space libraries implement "in-process multitasking" so that the program appears to be multi-threaded.
  • N:M thread model. This is best; unfortunately, some opinions still favor 1:1. Here there are both user- and kernel-level threads, and a scheduling algorithm decides what to run and where.
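To make the 1:M idea concrete, here is a toy sketch of "in-process multitasking": a round-robin scheduler that interleaves generator-based tasks on a single kernel thread. It only illustrates the concept and is not how any real threading library works; the names task, run_round_robin, and demo are invented for the example.

```python
import threading
from collections import deque


def task(name, steps, log):
    """A user-level 'thread': does one step of work, then yields control."""
    for i in range(steps):
        log.append((name, i, threading.get_native_id()))
        yield  # hand control back to the user-space scheduler


def run_round_robin(tasks):
    """Run generator tasks in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)
        except StopIteration:
            continue  # task finished, do not reschedule it
        queue.append(current)


def demo():
    log = []
    run_round_robin([task("a", 2, log), task("b", 2, log)])
    return log


if __name__ == "__main__":
    for name, step, tid in demo():
        print("task %s step %d ran on kernel thread %d" % (name, step, tid))
```

The tasks interleave, yet every step reports the same kernel thread ID, because the kernel sees only one thread. It also shows the weakness of the model: a task that blocks in a system call stalls all the others.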

Linux once had an N:M implementation (NGPT), but it was abandoned because of yet another drawback: Linux kernel calls are inherently synchronous (blocking), so some kernel cooperation would have been needed even for user-space synchronization. Nobody wanted to do that.

That is how it is.

P.S. To build a well-performing application, you should avoid creating many threads at once. Use a thread pool with a well-thought-out locking protocol. If you don't minimize thread creation and joining, your application will be slow and inefficient, regardless of whether the threading model is N:M.
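As a rough illustration of the thread-pool advice, the sketch below uses Python's standard concurrent.futures.ThreadPoolExecutor to run many small jobs on a fixed set of worker threads. The job itself (squaring a number) and the name run_jobs are placeholders chosen for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_jobs(n_jobs=20, max_workers=4):
    """Run n_jobs tasks on a pool and report which kernel threads were used."""

    def job(i):
        # Return the result together with the TID of the worker that ran it.
        return i * i, threading.get_native_id()

    # The pool creates at most max_workers threads and reuses them for all jobs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(job, range(n_jobs)))

    squares = [square for square, _ in results]
    tids = {tid for _, tid in results}
    return squares, tids


if __name__ == "__main__":
    squares, tids = run_jobs()
    print("%d jobs ran on %d kernel thread(s)" % (len(squares), len(tids)))
```

However many jobs are submitted, the number of distinct TIDs never exceeds max_workers, so the cost of creating and joining threads is paid only a few times.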
