
Multiple threads of the same process across multiple cores

I came to know that multiple threads of the same process may be executed on different cores of the same CPU. So does the definition of thread context switching still remain the same? I mean, is the address space still shared across threads running on different cores? Also, does a synchronized block still remain safe from a thread running on a different core?

Yes, all modern CPUs implement Symmetric Multi-Processing (SMP). That is, all memory appears in the address space of all cores on all CPUs, and the same address on every core refers to the same bytes in the memory chips.
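To make the point about the question concrete: because all cores see the same address space, a `synchronized` block protects shared state regardless of which cores the threads land on. A minimal sketch (the class and names here are illustrative, not from the question):

```java
// Two threads incrementing a shared counter. Whether the OS schedules
// them on one core or on different cores, the synchronized block gives
// mutual exclusion and memory visibility, so no increments are lost.
public class SharedCounter {
    private int count = 0;
    private final Object lock = new Object();

    public void increment() {
        synchronized (lock) {   // safe across cores, not just within one
            count++;
        }
    }

    public int get() {
        synchronized (lock) {   // also establishes visibility of count
            return count;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SharedCounter counter = new SharedCounter();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) counter.increment();
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();             // the OS is free to run these two threads
        t2.start();             // on different cores simultaneously
        t1.join();
        t2.join();
        System.out.println(counter.get()); // always 200000 thanks to the lock
    }
}
```

Without the `synchronized` block the two threads could interleave their read-modify-write of `count` and the final total would usually come out short.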

In hardware terms this is far from ideal. Modern CPUs are not truly SMP; rather, SMP is synthesized on top of a Non-Uniform Memory Architecture (NUMA): different memories sit on different address buses attached to different CPUs (or, on some architectures, different cores). Thus a core on one CPU cannot directly address memory attached to another CPU.

What Intel, AMD, etc. have done is implement very fast, network-like communication channels between CPUs/cores. So when a core tries to access a memory address that is not directly connected to it, it sends a request over the channel to the core that is connected to the right memory chips. That core does the lookup and sends the content back across the network. There's a whole lot of traffic like this going on all the time: keeping caches up to date, and so on.

This is not ideal, as a fair chunk of silicon is dedicated to running this network: transistors that might otherwise be used to implement more cores instead.

We only do this because, back in the day, the cheap way of getting multiple CPUs working together was to stick two chips on the same memory address/data bus. Cheap, because the hardware was cheap and multithreaded software (used to being context-switched within a single CPU) didn't really notice the difference.

To begin with this wasn't too bad: memory wasn't that slow in comparison to the CPUs. However, it soon became unsustainable, but by then there was far too much software in the world expecting an SMP environment (small things like all the major OSes). So whilst the ideal hardware architectural shift back then would have been pure NUMA (and damn the software), the commercial reality was that SMP had to persist. Hence the emergence of interconnects like Intel's QPI, AMD's HyperTransport, etc.

The irony is that quite a few modern languages (Golang, Rust, etc.) support message passing (CSP, the Actor model) as part of the language. This is exactly the programming paradigm one would be forced to adopt if the underlying hardware were pure NUMA. So there we have it: message-passing paradigms suitable for NUMA machines being implemented by trendy languages on top of SMP architectures, which in turn are synthesized on top of actual NUMA hardware. If you think that's crazy, you're not alone.
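Since the question is presumably about Java, it's worth noting you can get the same CSP flavour there too: instead of sharing state under a lock, threads hand values to each other over a channel. A sketch using `java.util.concurrent.BlockingQueue` as a stand-in for a Go-style channel (the class name and sentinel convention are illustrative choices, not a standard idiom):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// CSP-style message passing in Java: a producer thread communicates
// with the main thread by putting values on a bounded queue, rather
// than by mutating shared variables.
public class ChannelDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> channel = new ArrayBlockingQueue<>(16);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 5; i++) {
                    channel.put(i);       // blocks if the channel is full
                }
                channel.put(-1);          // sentinel: no more messages
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int sum = 0;
        while (true) {
            int msg = channel.take();     // blocks until a message arrives
            if (msg == -1) break;
            sum += msg;
        }
        producer.join();
        System.out.println(sum);          // prints 15 (1+2+3+4+5)
    }
}
```

No lock appears in the consumer's logic: the queue's own synchronization carries both the data and the ordering guarantees, which is exactly what a pure-NUMA machine would force on you anyway.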

You can think of QPI, HyperTransport, etc. as being a bit like Ethernet-attached memory, with CPUs or cores acting as memory servers for other CPUs and cores. Only QPI and HyperTransport are a lot quicker, and it's all hidden away from the software running on the CPUs.
