
When using multiprocessing Pool, should the number of worker processes be the same as the number of CPUs or cores?

When using a Python multiprocessing Pool, should the number of worker processes be the same as the number of CPUs or cores?

This article http://www.howtogeek.com/194756/cpu-basics-multiple-cpus-cores-and-hyper-threading-explained/ says each core is really a central processing unit on the CPU chip, and thus it seems there should not be a problem with running one process per core.

For example, if I have a single CPU chip with 4 cores, can one process per core (4 processes in total) be run without the possibility of slowing performance?
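For concreteness, a minimal sketch of the setup in question (square is just a placeholder for the real work); note that Pool already uses os.cpu_count() as its default when no worker count is passed:

```python
import os
from multiprocessing import Pool

def square(x):
    # Placeholder for the real CPU-bound work.
    return x * x

if __name__ == "__main__":
    # Pool() with no argument already defaults to os.cpu_count() workers;
    # the question is whether matching the core count like this is ideal.
    with Pool(processes=os.cpu_count()) as pool:
        results = pool.map(square, range(100))
    print(results[:5])
```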

From what I've learned regarding Python and multiprocessing, the best course of action is...

  • One process per core, but skip logical ones.

Hyperthreading is no help for Python. It will actually hurt performance in many cases, but of course test it yourself first.

  • Use the affinity (pip install affinity) module to stick each process to a specific core.

This was tested extensively, at least on Windows with 32-bit Python; not doing this will hurt performance significantly due to constant thrashing of the cache. And again: skip logical cores! The logical ones, assuming you have an Intel CPU with hyperthreading, are 1, 3, 5, 7, etc. (see the sketch at the end of this answer for one way to do both).

More threads than real cores will gain you nothing, unless there is also IO happening, which there shouldn't be if you're crunching numbers. Test my claim yourself, especially if you use Linux, as I didn't get to test on Linux at all.
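A sketch covering both points above, assuming psutil (pip install psutil) as a stand-in for the affinity module, since psutil exposes both a physical-core count and per-process CPU pinning on Windows and Linux; the core numbering is an assumption you should verify on your own machine:

```python
import multiprocessing as mp
import psutil

def _pin_worker(counter, physical_cpu_ids):
    # Runs once in each worker: take the next index from the shared
    # counter and pin this worker process to that physical core.
    with counter.get_lock():
        idx = counter.value
        counter.value += 1
    core = physical_cpu_ids[idx % len(physical_cpu_ids)]
    psutil.Process().cpu_affinity([core])

def work(n):
    # Placeholder CPU-bound task.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    n_physical = psutil.cpu_count(logical=False) or 1   # physical cores only
    n_logical = psutil.cpu_count(logical=True) or n_physical
    # Assumption: with hyperthreading enabled, the even-numbered logical
    # CPUs (0, 2, 4, ...) sit on distinct physical cores, as described above.
    if n_logical == 2 * n_physical:
        physical_ids = list(range(0, n_logical, 2))
    else:
        physical_ids = list(range(n_physical))           # no hyperthreading
    counter = mp.Value("i", 0)
    with mp.Pool(processes=n_physical,
                 initializer=_pin_worker,
                 initargs=(counter, physical_ids)) as pool:
        print(pool.map(work, [200_000] * n_physical))
```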

It really depends on your workload. Case by case, the best approach is to run some benchmarks and see the results.
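A minimal sketch of such a benchmark, with a CPU-bound placeholder task (cpu_bound here is a stand-in; substitute your real workload) timed under a few different pool sizes:

```python
import os
import time
from multiprocessing import Pool

def cpu_bound(n):
    # Stand-in for the real workload being benchmarked.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    ncpu = os.cpu_count() or 1
    tasks = [300_000] * 64
    for workers in (1, max(ncpu // 2, 1), ncpu, ncpu * 2):
        start = time.perf_counter()
        with Pool(processes=workers) as pool:
            pool.map(cpu_bound, tasks)
        print(f"{workers:>2} workers: {time.perf_counter() - start:.2f}s")
```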

Scheduling processes is an expensive operation: the more processes running, the more often the context must be switched.

If most of your processes are not running (for example, they are waiting for IO), then overcommitting might prove beneficial. On the contrary, if your processes are running most of the time, adding more of them to contend for your CPU is going to be detrimental.
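To illustrate that trade-off, a sketch where the task mostly waits (time.sleep standing in for real IO such as network or disk); here a pool several times larger than the core count typically finishes sooner, whereas for the CPU-bound benchmark above it would not:

```python
import os
import time
from multiprocessing import Pool

def io_bound(_):
    # Stand-in for work that spends most of its time waiting on IO.
    time.sleep(0.2)

if __name__ == "__main__":
    ncpu = os.cpu_count() or 1
    for workers in (ncpu, ncpu * 4):          # modest vs. overcommitted pool
        start = time.perf_counter()
        with Pool(processes=workers) as pool:
            pool.map(io_bound, range(64))
        print(f"{workers:>3} workers: {time.perf_counter() - start:.2f}s")
```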
