
Cost of context switch between threads of same process, on Linux

Is there any good empirical data on the cost of context switching between threads of the same process on Linux (mainly x86 and x86_64)? I'm talking about the number of cycles or nanoseconds between the last instruction one thread executes in userspace before being put to sleep, voluntarily or involuntarily, and the first instruction a different thread of the same process executes after waking up on the same CPU/core.

I wrote a quick test program that repeatedly performs rdtsc in two threads assigned to the same CPU/core, stores the result in a volatile variable, and compares it to its sister thread's corresponding volatile variable. The first time it detects a change in the sister thread's value, it prints the difference, then goes back to looping. I'm getting minimum/median counts of about 8900/9600 cycles this way on an Atom D510 CPU. Does this procedure seem reasonable, and do the numbers seem believable?
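For reference, a minimal sketch of that kind of rdtsc ping-pong (assuming pthreads, gcc's __rdtsc() intrinsic from <x86intrin.h>, and pinning via pthread_setaffinity_np(); this is an illustration of the described procedure, not the exact program above) might look like this:

```c
/* switch_cost.c - rough sketch of the rdtsc ping-pong described above.
 * Build: gcc -O2 -pthread switch_cost.c -o switch_cost
 * Names and structure are illustrative only. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>              /* __rdtsc() */

static volatile uint64_t stamp[2];  /* one timestamp slot per thread */

static void pin_to_cpu0(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);               /* both threads share cpu/core 0 */
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

static void *worker(void *arg)
{
    int me = (int)(intptr_t)arg, other = 1 - me;
    uint64_t last_seen = 0;

    pin_to_cpu0();
    for (;;) {
        uint64_t now = __rdtsc();
        stamp[me] = now;                       /* publish our latest timestamp */
        uint64_t s = stamp[other];
        if (s != last_seen && last_seen != 0)  /* sister thread ran since our last slice */
            printf("thread %d: delta = %llu cycles\n",
                   me, (unsigned long long)(now - s));
        last_seen = s;
    }
    return NULL;
}

int main(void)
{
    pthread_t t[2];
    pthread_create(&t[0], NULL, worker, (void *)0);
    pthread_create(&t[1], NULL, worker, (void *)1);
    pthread_join(t[0], NULL);                  /* never returns; run under a timeout */
    return 0;
}
```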

My goal is to estimate whether, on modern systems, a thread-per-connection server model could be competitive with, or even outperform, select-type multiplexing. This seems plausible in theory, since the transition from performing IO on fd X to fd Y involves merely going to sleep in one thread and waking up in another, rather than multiple syscalls, but it depends on the overhead of context switching.

(Disclaimer: this isn't a direct answer to the question; it's just some suggestions that I hope will be helpful.)

Firstly, the numbers you're getting certainly sound like they're in the right ballpark. Note, however, that interrupt/trap latency can vary a lot between CPU models implementing the same ISA. It also matters whether your threads have used floating-point or vector operations: if they haven't, the kernel avoids saving and restoring the floating-point/vector unit state.

You should be able to get more accurate numbers by using the kernel's tracing infrastructure; perf sched, in particular, is designed to measure and analyse scheduler latency.

If your goal is to model thread-per-connection servers, then you probably shouldn't be measuring involuntary context-switch latency: in such a server, the majority of context switches will be voluntary, as a thread blocks in read() waiting for more data from the network. A better testbed might therefore be to measure the latency from one thread blocking in a read() to another thread being woken up from a read() of its own.
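A minimal sketch of such a testbed, assuming two threads pinned to one CPU bouncing a byte across a pair of pipes so that every switch is voluntary (the pipe and clock_gettime() details are illustrative, not part of the suggestion above), might look like this:

```c
/* pingpong.c - sketch of measuring voluntary switch latency via blocking read().
 * Build: gcc -O2 -pthread pingpong.c -o pingpong
 * Assumes both threads are pinned to the same core; names are illustrative. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define ROUNDS 100000

static int ab[2], ba[2];            /* pipe A->B and pipe B->A */

static void pin_to_cpu0(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

static void *echo(void *unused)
{
    char c;
    pin_to_cpu0();
    for (int i = 0; i < ROUNDS; i++) {
        read(ab[0], &c, 1);         /* block until the other thread writes */
        write(ba[1], &c, 1);        /* wake it back up */
    }
    return NULL;
}

int main(void)
{
    pthread_t t;
    struct timespec t0, t1;
    char c = 'x';

    pipe(ab);
    pipe(ba);
    pin_to_cpu0();
    pthread_create(&t, NULL, echo, NULL);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ROUNDS; i++) {
        write(ab[1], &c, 1);        /* wake the echo thread ... */
        read(ba[0], &c, 1);         /* ... then sleep until it answers */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    pthread_join(t, NULL);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    /* each round trip is two blocking read()s, i.e. two voluntary switches */
    printf("~%.0f ns per voluntary switch (incl. pipe syscall overhead)\n",
           ns / (2.0 * ROUNDS));
    return 0;
}
```

Note that the resulting figure includes the pipe read()/write() syscall overhead, so it is an upper bound on the pure switch cost.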

Note that in a well-written multiplexing server under heavy load, the transition from fd X to fd Y often involves no extra system call at all: the server simply iterates over the list of ready file descriptors returned by a single epoll_wait() call. One thread also ought to have a smaller cache footprint than multiple threads, simply through having only one stack. I suspect the only way to settle the matter (for some definition of "settle") might be to have a benchmark shootout...
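For comparison, the single-syscall transition described above looks roughly like the following epoll_wait() loop. This is only a sketch with illustrative names; the epoll setup (epoll_create1()/epoll_ctl()), error handling and application logic are omitted:

```c
/* Sketch of the multiplexing loop referred to above: one epoll_wait() call
 * returns many ready fds, so moving from fd X to fd Y is just the next loop
 * iteration, not another system call. */
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 64

void event_loop(int epfd)
{
    struct epoll_event events[MAX_EVENTS];
    char buf[4096];

    for (;;) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);  /* one syscall ... */
        for (int i = 0; i < n; i++) {                      /* ... many ready fds */
            int fd = events[i].data.fd;
            ssize_t len = read(fd, buf, sizeof(buf));
            if (len > 0) {
                /* handle_request(fd, buf, len);  -- application logic goes here */
            } else {
                close(fd);                                 /* EOF or error */
            }
        }
    }
}
```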
