Multiple iterables as arguments in python multiprocessing
I have a 3-dimensional dataset of shape (100, 64, 3000), and I am computing features using multiprocessing. I parallelize across channels, so that each process covers 8 of the 64 channels. Here is my code:
import numpy as np
import time
from multiprocessing import Process, current_process, Pool

sub = 1

def cal_feature(ch):
    data = np.load('data_{}.npy'.format(sub))
    return np.mean(data[:, ch:ch+8, :], -1)

# multiprocessing
if __name__ == '__main__':
    start = time.time()
    ch = [i for i in range(0, 64, 8)]
    with Pool(8) as p:
        result = p.map(cal_feature, ch)
    print(time.time() - start)
You can create dummy data this way:
import numpy as np
np.save('data_1', np.random.randint(0, 100, size=(100, 64, 3000)))
np.save('data_2', np.random.randint(0, 100, size=(100, 64, 3000)))
np.save('data_3', np.random.randint(0, 100, size=(100, 64, 3000)))
np.save('data_4', np.random.randint(0, 100, size=(100, 64, 3000)))
In my code I have to set manually which data file is loaded, via sub=1. I want to modify the code above so that it picks sub=1, computes the features for all channels in a multiprocess way, and, when that is done, moves on to subject 2, and so on.
EDIT
ind_result = [result[i:i+8] for i in range(0, len(sub) * 8, 8)]
for i, j in zip(sub, ind_result):
    np.save('subject_0_{}'.format(i), np.concatenate(j, 1))
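The regrouping in the EDIT snippet relies on result being a flat list ordered subject by subject, with 8 channel chunks per subject. A minimal sketch of that step on synthetic arrays, assuming this ordering (regroup_by_subject and fake_result are hypothetical names used only for illustration, and nothing is loaded from or saved to disk):

```python
import numpy as np

def regroup_by_subject(result, subjects, chunks_per_subject=8):
    """Split a flat, subject-major list of per-chunk arrays into one array per subject."""
    grouped = {}
    for k, s in enumerate(subjects):
        chunk_list = result[k * chunks_per_subject:(k + 1) * chunks_per_subject]
        # Each chunk has shape (trials, 8); joining along axis 1 restores all 64 channels.
        grouped[s] = np.concatenate(chunk_list, axis=1)
    return grouped

# Synthetic stand-in for the pool.map output: 2 subjects x 8 chunks of shape (100, 8).
subjects = [1, 2]
fake_result = [np.full((100, 8), fill_value=s) for s in subjects for _ in range(8)]

features = regroup_by_subject(fake_result, subjects)
print({s: a.shape for s, a in features.items()})
# {1: (100, 64), 2: (100, 64)}
```

Each per-subject array could then be passed to np.save as in the snippet above.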
You're facing a common limitation of multiprocessing: pool.map only accepts a single iterable of arguments.
You can work around that by packing ch and sub into a tuple and building the argument iterable with itertools.product (reference here). You can then unpack the two arguments inside the cal_feature function.
import numpy as np
import time
from multiprocessing import Pool
from itertools import product

def cal_feature(param):
    # each param is a (sub, ch) tuple
    sub, ch = param
    data = np.load('data_{}.npy'.format(sub))
    return np.mean(data[:, ch:ch+8, :], -1)

# multiprocessing
if __name__ == '__main__':
    start = time.time()
    ch = [i for i in range(0, 64, 8)]
    sub = [1, 2, 3, 4]
    # here's the magic: product() returns a one-shot iterator, so store it
    # as a list to be able to both print it and pass it to p.map
    param_list = list(product(sub, ch))
    print(param_list)
    # [(1, 0), (1, 8), (1, 16), (1, 24), (1, 32), (1, 40), (1, 48),
    # (1, 56), (2, 0), (2, 8), (2, 16), (2, 24), (2, 32), (2, 40),
    # (2, 48), (2, 56), (3, 0), (3, 8), (3, 16), (3, 24), (3, 32),
    # (3, 40), (3, 48), (3, 56), (4, 0), (4, 8), (4, 16), (4, 24),
    # (4, 32), (4, 40), (4, 48), (4, 56)]
    with Pool(8) as p:
        result = p.map(cal_feature, param_list)
    print(time.time() - start)
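One detail worth keeping in mind with itertools.product: it returns a one-shot iterator, so consuming it with list(...) for printing and then handing the same object to p.map would leave the pool with nothing to process. A small sketch of that behaviour, using a shortened parameter grid for illustration:

```python
from itertools import product

# product() yields each (sub, ch) pair exactly once.
pairs = product([1, 2], [0, 8])

first_pass = list(pairs)   # consumes the iterator
second_pass = list(pairs)  # nothing left to yield

print(first_pass)   # [(1, 0), (1, 8), (2, 0), (2, 8)]
print(second_pass)  # []
```

Storing list(product(sub, ch)) in a variable once, and reusing that list, avoids the problem.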
Pool has some limitations. I tried a few methods and recommend this one:
from multiprocessing import Pool
from itertools import product
from functools import partial

def cal_feature(sub, ch):
    return sub, ch

def pool_helper(f, args):
    # unpack the (sub, ch) tuple into positional arguments
    return f(*args)

# the guard is required on platforms that start workers by re-importing this module
if __name__ == '__main__':
    ch = [i for i in range(0, 16, 8)]
    sub_list = [1, 2, 3]
    with Pool(8) as p:
        result = p.map(partial(pool_helper, cal_feature), product(sub_list, ch))
    print(result)
    # output is [(1, 0), (1, 8), (2, 0), (2, 8), (3, 0), (3, 8)]
We don't need to change the original cal_feature, and pool_helper can be used for any function that accepts positional parameters.
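On Python 3.3 and later, Pool.starmap unpacks each argument tuple into positional arguments itself, which may make the helper unnecessary. A minimal sketch, assuming the same toy cal_feature that only echoes its arguments (build_params is a hypothetical helper introduced here for illustration):

```python
from itertools import product
from multiprocessing import Pool

def cal_feature(sub, ch):
    # Toy stand-in: returns its arguments instead of computing a feature.
    return sub, ch

def build_params(sub_list, ch):
    # Materialise every (sub, ch) combination as a list of tuples.
    return list(product(sub_list, ch))

if __name__ == "__main__":
    params = build_params([1, 2, 3], range(0, 16, 8))
    with Pool(2) as p:
        # starmap calls cal_feature(sub, ch) for each tuple and keeps input order.
        result = p.starmap(cal_feature, params)
    print(result)
    # [(1, 0), (1, 8), (2, 0), (2, 8), (3, 0), (3, 8)]
```

Like pool_helper, this leaves cal_feature with its original two-argument signature.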