
multiprocessing.pool code stuck and does not finish running

I am trying to use the Pool class from Python's multiprocessing module to do some data wrangling over a pandas DataFrame in parallel (the code is under the 'Main code' heading below). The problem is that my code gets stuck and never finishes, no matter how small an input DataFrame I give it (even one with only 10 rows). I also tried a simple example (under the 'Pool example' heading below), and even that doesn't run.

Here is a detailed description of what I am trying to do: I have an indices DataFrame with 10 columns and 650K rows. The idea is to take the 10 values in each row of the indices DataFrame, select the rows at those positions in a target DataFrame 'traindat', and take the mean of a few of its columns. I have to do this for all 650K rows of the indices DataFrame.
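For reference, the computation described above can be sketched serially on toy data, which gives a baseline to compare any parallel version against. The data shapes and the helper name `neighbour_mean` are assumptions made for illustration; only `traindat`, `indices`, and the column range 4:28 come from the question.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the real data (assumed shapes, not the original data).
rng = np.random.default_rng(0)
traindat = pd.DataFrame(rng.random((20, 28)))               # 20 rows, 28 columns
indices = pd.DataFrame(rng.integers(0, 20, size=(5, 10)))   # 5 rows of 10 row positions


def neighbour_mean(row_positions):
    # Hypothetical helper: mean of columns 4..27 over the traindat rows
    # found at the given integer positions.
    return traindat.iloc[list(row_positions), 4:28].mean()


# One output row per row of `indices`.
data_all_new = pd.DataFrame(
    [neighbour_mean(row) for _, row in indices.iterrows()]
)
print(data_all_new.shape)
```

Running this serially on a small slice first makes it easier to tell whether a hang comes from the computation itself or from the process pool.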

Main code:

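import multiprocessing as mp
import pandas as pd

# traindat and indices are assumed to be defined at module level above

def func(args):
    # pool.map passes a single argument, so unpack the (row, idx) tuple
    x, i = args
    # mean of columns 4..27 over the traindat rows at positions x
    dftmp = traindat.iloc[x, 4:28].mean()
    return pd.DataFrame(dftmp).transpose()

if __name__ == '__main__':  # needed on Windows, where workers re-import this module
    pool = mp.Pool(processes=3)
    new_rows = pool.map(func, [(row, idx) for idx, row in indices.iterrows()])
    pool.close()
    pool.join()
    data_all_new = pd.concat(new_rows)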

Since this code wouldn't run, I also tried the following simple code to see whether Pool runs at all for me. It doesn't.

Pool example:

import sys
sys.modules['__main__'].__file__ = 'ipython'
from multiprocessing import Pool
def f(x):
    return x*x

if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [1, 2, 3]))

I don't get any errors. The code simply gets stuck and doesn't finish running. Please help if you understand this issue.

Edit: I later realized the issue only happens on Windows, so I am editing the question to include that.

With the help of a colleague, I belatedly realized this is a duplicate question. Posting a link to the original question and answer in case someone stumbles upon this: Basic parallel python program freezes on Windows

It seems this issue is related to the IDE not being configured properly.
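As a rough illustration of a layout that tends to work on Windows: worker processes there are started by spawning a fresh interpreter that re-imports the main module, so the worker function has to be importable from a real file and pool creation has to sit behind the `__main__` guard. The sketch below assumes the code is saved to a file (the name `pool_check.py` is hypothetical) and run from a terminal instead of an interactive IDE console; `square` and `run_pool` are illustrative names, not part of the original question.

```python
# pool_check.py (hypothetical file name); run it as `python pool_check.py`
# from a terminal rather than from an interactive IDE console.
from multiprocessing import Pool


def square(x):
    # Defined at module top level so spawned workers can import it.
    return x * x


def run_pool(values, processes=2):
    # The pool is created only when this function is called,
    # never as a side effect of importing the module.
    with Pool(processes=processes) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # On Windows each worker re-imports this module; the guard keeps the
    # workers from trying to create pools of their own.
    print(run_pool([1, 2, 3]))  # prints [1, 4, 9]
```

If this script finishes from the command line but the same code hangs inside the IDE, that points to the IDE's interactive console and not to the pool code.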
