
Running Functions with Multiple Arguments Concurrently and Aggregating Complex Results

Set Up

This is part two of a question that I posted about accessing results from multiple processes.
For part one, click here: Link to Part One

I have a complex set of data that I need to compare against various sets of constraints concurrently, but I'm running into multiple issues. The first is getting results out of my multiple processes, and the second is getting anything beyond an extremely simple function to run concurrently.

Example

I have multiple sets of constraints that I need to compare against some data, and I would like to do this concurrently because I have a lot of sets of constraints. In this example I'll use just two sets of constraints.

Jupyter Notebook

Create Some Sample Constraints & Data

import pandas as pd

# Create a set of constraints
constraints = pd.DataFrame([['2x2x2', 2,2,2],['5x5x5',5,5,5],['7x7x7',7,7,7]],
                     columns=['Name','First', 'Second', 'Third'])
constraints.set_index('Name', inplace=True)

# Create a second set of constraints
constraints2 = pd.DataFrame([['4x4x4', 4,4,4],['6x6x6',6,6,6],['7x7x7',7,7,7]],
                      columns=['Name','First', 'Second', 'Third'])
constraints2.set_index('Name', inplace=True)

# Create some sample data
items = pd.DataFrame([['a', 2,8,2],['b',5,3,5],['c',7,4,7]], columns=['Name','First', 'Second', 'Third'])
items.set_index('Name', inplace=True)

Running Sequentially

If I run this sequentially I get my desired results, but with the data I am actually dealing with it can take over 12 hours. Here is what it looks like run sequentially, so that you know what my desired result looks like.

# Function
def seq_check_constraint(df_constraints_input, df_items_input):
    df_constraints = df_constraints_input.copy()
    df_items = df_items_input.copy()
    
    df_items['Product'] = df_items.product(axis=1)
    df_constraints['Product'] = df_constraints.product(axis=1)
    
    for constraint in df_constraints.index:
        df_items[constraint+'Product'] = df_constraints.loc[constraint,'Product']
        
    for constraint in df_constraints.index:
        col_name = constraint+'_fits'
        # True where the item's product is below this constraint's product
        df_items[col_name] = df_items['Product'] < df_items[constraint+'Product']
    
    # Drop the first 7 columns (inputs and products), keeping the *_fits columns
    df_res = df_items.iloc[:, 7:]
    return df_res

constraint_sets = [constraints, constraints2, ...]
results = {}

# enumerate supplies the counter so each result gets its own key
for counter, df in enumerate(constraint_sets):
    res = seq_check_constraint(df, items)
    results['constraints'+str(counter)] = res

or uglier:

df_res1 = seq_check_constraint(constraints, items)
df_res2 = seq_check_constraint(constraints2, items)

results = {'constraints0':df_res1, 'constraints1': df_res2}

Running these sequentially, I end up with DataFrames like the ones shown here.

I'd ultimately like to end up with a dictionary or list of the DataFrames, or be able to append the DataFrames all together. The order in which I get the results doesn't matter to me; I just want to have them all together so that I can do further analysis on them.
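One possible way to keep the results together is sketched here, assuming every result DataFrame shares the same items index. The two result frames are hypothetical stand-ins for what the check function returns; passing a dict to `pd.concat` uses its keys as an outer column level, so each constraint set stays addressable.

```python
import pandas as pd

# Hypothetical stand-ins for the per-constraint-set results described above.
index = pd.Index(["a", "b", "c"], name="Name")
results = {
    "constraints0": pd.DataFrame(
        {"2x2x2_fits": [False, False, False], "5x5x5_fits": [True, True, False]},
        index=index,
    ),
    "constraints1": pd.DataFrame(
        {"4x4x4_fits": [True, False, False], "6x6x6_fits": [True, True, True]},
        index=index,
    ),
}

# Passing a dict to concat uses its keys as the outer column level.
combined = pd.concat(results, axis=1)

# Select one constraint set's columns back out for further analysis.
first = combined["constraints0"]
print(combined)
```

Concatenating along the columns avoids the NaN padding that stacking rows would produce, since each constraint set contributes differently named columns.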

What I've Tried

This brings me to my attempts at multiprocessing. From what I understand, you can use either Queues or Managers to handle shared data and memory, but I haven't been able to get either to work. I am also struggling to get my function, which takes two arguments, to execute within the Pool at all.

Here is my code as it stands right now, using the same sample data from above:

Function

def check_constraint(df_constraints_input, df_items_input):
    df_constraints = df_constraints_input.copy()
    df_items = df_items_input.copy()
    
    df_items['Product'] = df_items.product(axis=1)  # Mathematical Product
    df_constraints['Product'] = df_constraints.product(axis=1)
    
    for constraint in df_constraints.index:
        df_items[constraint+'Product'] = df_constraints.loc[constraint,'Product']
        
    for constraint in df_constraints.index:
        col_name = constraint+'_fits'
        # True where the item's product is below this constraint's product
        df_items[col_name] = df_items['Product'] < df_items[constraint+'Product']
    
    # Drop the first 7 columns (inputs and products), keeping the *_fits columns
    df_res = df_items.iloc[:, 7:]
    return df_res
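Separately from the multiprocessing question, the comparison itself might be expressible without the intermediate product columns. The following is only a sketch under that assumption, not the asker's code: `check_constraint_vectorized` is a hypothetical name, and it builds each `_fits` column directly from the row products.

```python
import pandas as pd

def check_constraint_vectorized(df_constraints, df_items):
    # Row-wise mathematical products, one value per item / per constraint.
    item_product = df_items.product(axis=1)
    constraint_product = df_constraints.product(axis=1)
    # One boolean column per constraint: does the item's product fall below it?
    fits = {
        f"{name}_fits": item_product < limit
        for name, limit in constraint_product.items()
    }
    return pd.DataFrame(fits, index=df_items.index)

# Sample data from the question.
constraints = pd.DataFrame(
    [["2x2x2", 2, 2, 2], ["5x5x5", 5, 5, 5], ["7x7x7", 7, 7, 7]],
    columns=["Name", "First", "Second", "Third"],
).set_index("Name")
items = pd.DataFrame(
    [["a", 2, 8, 2], ["b", 5, 3, 5], ["c", 7, 4, 7]],
    columns=["Name", "First", "Second", "Third"],
).set_index("Name")

out = check_constraint_vectorized(constraints, items)
print(out)
```

Because the inputs are never modified, this version also does not need the defensive copies or the positional `iloc` slice.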

Jupyter Notebook

df_manager = mp.Manager()
df_ns = df_manager.Namespace()
df_ns.constraint_sets = [constraints, constraints2]


print('---Starting pool---')

if __name__ == '__main__':
    with mp.Pool() as p:
        print('--In the pool--')
        res = p.map_async(mpf.check_constraint, (df_ns.constraint_sets, itertools.repeat(items)))
        print(res.get())

And my current error:

TypeError: check_constraint() missing 1 required positional argument: 'df_items_input'
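The error is consistent with how `map_async` consumes its iterable: the tuple passed above has two elements (the list of constraint sets and the `itertools.repeat` object), and each element is handed to the function as its single argument. A minimal sketch of that behaviour follows. It uses `multiprocessing.pool.ThreadPool` as a stand-in because it exposes the same `map_async` interface and runs inside a notebook cell; `show_args` and the placeholder strings are hypothetical.

```python
import itertools
from multiprocessing.pool import ThreadPool

def show_args(*args):
    # Report how many positional arguments each call received.
    return len(args)

constraint_sets = ["constraints", "constraints2"]  # placeholders for the DataFrames
items = "items"  # placeholder for the items DataFrame

with ThreadPool(2) as pool:
    # Same call shape as the failing code: a 2-tuple is the iterable.
    res = pool.map_async(show_args, (constraint_sets, itertools.repeat(items)))
    arg_counts = res.get()

# One call per tuple element, each receiving a single argument.
print(arg_counts)
```

A function that requires two positional arguments therefore fails on the first call, which matches the `TypeError` shown.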

The easiest way is to create a list of tuples (where each tuple represents one set of arguments to the function) and pass it to starmap.

df_manager = mp.Manager()
df_ns = df_manager.Namespace()
df_ns.constraint_sets = [constraints, constraints2]


print('---Starting pool---')

if __name__ == '__main__':
    with mp.Pool() as p:
        print('--In the pool--')
        check_constraint_args = []
        for constraint in df_ns.constraint_sets:
            check_constraint_args.append((constraint, items))
        # starmap blocks and returns a list directly, so there is no .get()
        res = p.starmap(mpf.check_constraint, check_constraint_args)
        print(res)
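The argument list can also be built with `zip` and `itertools.repeat`, and the returned list turned into a keyed dictionary. The sketch below makes several assumptions: `check_constraint` is a simplified, hypothetical stand-in that works on precomputed products (computed from the sample data) instead of DataFrames, and `ThreadPool` is used so the example runs in a notebook cell. For CPU-bound work, `mp.Pool` offers the same `starmap` call, with the worker function kept in an importable module.

```python
import itertools
from multiprocessing.pool import ThreadPool

def check_constraint(constraints, items):
    # Hypothetical stand-in for mpf.check_constraint: for each item, record
    # which constraints have a larger product than the item.
    return {
        item: {name: size < limit for name, limit in constraints.items()}
        for item, size in items.items()
    }

# Products computed from the sample data in the question.
constraint_sets = [
    {"2x2x2": 8, "5x5x5": 125, "7x7x7": 343},
    {"4x4x4": 64, "6x6x6": 216, "7x7x7": 343},
]
items = {"a": 32, "b": 75, "c": 196}

# zip stops at the shorter input, so the endless repeat() is safe here.
args = list(zip(constraint_sets, itertools.repeat(items)))

with ThreadPool() as pool:
    # starmap unpacks each tuple into positional arguments and returns
    # a plain list in the same order as args.
    res = pool.starmap(check_constraint, args)

# Key each result by its position, mirroring the sequential version.
results = {f"constraints{i}": r for i, r in enumerate(res)}
print(results["constraints0"]["a"])
```

Since `starmap` preserves input order, the dictionary keys line up with the order of `constraint_sets` even though the calls may finish in any order.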
