
Hardware for python multiprocessing

I have a task where I need to run the same function on many different pandas dataframes. I load all the dataframes into a list, then pass it to Pool.map using the multiprocessing module. The function code itself has been vectorized as much as possible; it contains a few if/else clauses and no matrix operations.
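For illustration, the setup described above might look roughly like the sketch below. The names process_frame and make_frames, the column names a and b, and the synthetic data are all assumptions made for the example; the actual function and dataframes are not shown in the question.

```python
import multiprocessing as mp

import numpy as np
import pandas as pd


def process_frame(df):
    """Hypothetical stand-in for the per-dataframe function."""
    # Vectorized if/else: np.where picks a value per row without a Python loop.
    adjusted = np.where(df["a"] > 0, df["a"] * 2.0, df["b"])
    return float(adjusted.sum())


def make_frames(n_frames=20, n_rows=1000, seed=0):
    """Build synthetic dataframes; real data would be loaded from disk."""
    rng = np.random.default_rng(seed)
    return [
        pd.DataFrame({"a": rng.normal(size=n_rows), "b": rng.normal(size=n_rows)})
        for _ in range(n_frames)
    ]


if __name__ == "__main__":
    frames = make_frames()
    # One worker per core; each dataframe is pickled and sent to a worker.
    with mp.Pool(processes=10) as pool:
        results = pool.map(process_frame, frames)
    print(len(results))
```

Note that every dataframe has to be serialized and copied to a worker process, so for small frames that transfer cost can outweigh the benefit of extra workers.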

I'm currently using a 10-core Xeon and would like to speed things up, ideally going from Pool(10) to Pool(xxx). I see two possibilities:

  • GPU processing. From what I have read, though, I'm not sure I can achieve what I want, and it would in any case need a lot of code modification.

  • Xeon Phi. I know it's being discontinued, but supposedly code adaptation is easier, and if that's really the case I'd happily get one.

Which path should I concentrate on? Are there any other alternatives?

Software: Ubuntu 18.04, Python 3.7. Hardware: X99 chipset, 10-core Xeon (no HT).

It took a while, but after changing it all to numpy and achieving a little more vectorization I managed to get a speed increase of over 20x, so thanks Paul. Thanks to max9111 too; I'll have a look into numba.

You can rely on the new Intel 2066 platform or Xeon. With the newest AVX512 they accelerated numpy processing a lot (numpy is the base of pandas). Check: https://software.intel.com/en-us/articles/the-inside-scoop-on-how-we-accelerated-numpy-umath-functions

First of all, try switching to numpy-based calculations (even just using .values on the Series); this can improve processing speed by up to 10x.
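As a rough sketch of what that switch can look like, the two functions below compute the same result, once on pandas Series and once on the underlying numpy arrays obtained with .values. The function names and the columns a, b and c are hypothetical; the asker's real calculation is not shown in the thread.

```python
import numpy as np
import pandas as pd


def scale_with_series(df):
    # Arithmetic on Series: pandas aligns indexes and wraps each intermediate result.
    return (df["a"] * df["b"] + df["c"]).sum()


def scale_with_arrays(df):
    # Same arithmetic on the underlying ndarrays, skipping the pandas overhead.
    a = df["a"].values
    b = df["b"].values
    c = df["c"].values
    return (a * b + c).sum()
```

Timing both with timeit on representative data is the simplest way to see whether the change pays off; the actual gain depends on frame size and on how many operations the function chains together.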

You can also try a dual-CPU motherboard to get more parallelism for the calculation.

In most situations, the bottleneck is not the processing of the data but IO: reading from drive to memory. This will be a problem when using a GPU too.
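One way to check whether IO dominates is to time loading and computing separately before buying hardware. The sketch below assumes the data sits in CSV files and uses a placeholder compute function; timed, compute and profile_file are hypothetical helpers, not part of any library.

```python
import os
import tempfile
import time

import numpy as np
import pandas as pd


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def compute(df):
    # Hypothetical stand-in for the real per-dataframe work.
    return float((df["a"].values * df["b"].values).sum())


def profile_file(path):
    """Return (load_seconds, compute_seconds) for one CSV file."""
    df, load_s = timed(pd.read_csv, path)
    _, compute_s = timed(compute, df)
    return load_s, compute_s


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.csv")
        frame = pd.DataFrame({"a": rng.normal(size=100_000), "b": rng.normal(size=100_000)})
        frame.to_csv(path, index=False)
        load_s, compute_s = profile_file(path)
        print(f"load: {load_s:.4f}s  compute: {compute_s:.4f}s")
```

If the load time is the larger of the two, more cores or a GPU are unlikely to help much, and a faster drive or a binary file format may be the better investment.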


 