
Python: running multiple functions concurrently

I am using PyCharm to run my Python code. I have a function that combines multiple Excel files in a folder and writes them to a CSV file. I have 5 folders, and I want to create one CSV file for each. Right now I run this sequentially, i.e., one folder after another, which takes quite a long time. Another option is to run the same code in 5 different PyCharm projects. This works, but is there a way to run this function 5 times concurrently in a single project? My pseudocode is:

combine("folder1", "csvfile1")
combine("folder2", "csvfile2")
combine("folder3", "csvfile3")
combine("folder4", "csvfile4")
combine("folder5", "csvfile5")

Try using multiprocessing to map the function combine onto separate cores and run it asynchronously. Here is an example:

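import multiprocessing as mp # part of the standard library, no install needed

fo = ["folder1","folder2","folder3","folder4"]
fi = ["csvfile1","csvfile2","csvfile3","csvfile4"]

def combine(a,b):
    #YOUR CODE HERE
    print("Completed: ",a,'->',b)

if __name__ == "__main__": # needed when workers are started with spawn (default on Windows and macOS)
    pool = mp.Pool(processes=4) #Set this to the max number of cores you have
    # x is already an (a, b) tuple, so pass it directly as args
    results = [pool.apply_async(combine, args=x) for x in zip(fo,fi)]
    pool.close() # no more tasks will be submitted
    pool.join() # wait for all workers to finish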

Completed:  folder2 -> csvfile2
Completed:  folder1 -> csvfile1
Completed:  folder3 -> csvfile3
Completed:  folder4 -> csvfile4

Each call to the function runs asynchronously and in parallel, making better use of your resources.
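As a hedged illustration of the "set this to the max number of cores" comment above, the sketch below sizes the pool from os.cpu_count() and the number of folders, and uses the pool as a context manager. The helper name pool_size and the placeholder body of combine are assumptions for illustration, not part of the original answer.

```python
import multiprocessing as mp
import os

folders = ["folder1", "folder2", "folder3", "folder4", "folder5"]
csvfiles = ["csvfile1", "csvfile2", "csvfile3", "csvfile4", "csvfile5"]

def combine(folder, csvfile):
    # Placeholder for the real Excel-to-CSV work (assumption).
    return f"{folder} -> {csvfile}"

def pool_size(n_jobs):
    # Hypothetical helper: never start more workers than jobs or cores.
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, min(n_jobs, cores))

if __name__ == "__main__":
    jobs = list(zip(folders, csvfiles))
    # Leaving the with-block terminates the pool, so collect results inside it.
    with mp.Pool(processes=pool_size(len(jobs))) as pool:
        pending = [pool.apply_async(combine, args=job) for job in jobs]
        for result in pending:
            print("Completed:", result.get())  # get() waits for that task
```

Starting more workers than there are folders gains nothing here, since each folder is a single task.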



As Furas pointed out, you can now use starmap (and its async version), which multiprocessing has supported since Python 3.3. It maps each tuple of parameters to the function directly, instead of iterating over zip and applying the function yourself.

results = pool.starmap_async(combine, zip(fo,fi)) #async version
results = pool.starmap(combine, zip(fo,fi)) #sync version

If your function returns a value and you want to retrieve it, the synchronous version returns the list of values directly in results, but for the asynchronous version you need to call results.get().
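One practical consequence of calling .get() is that an exception raised inside a worker is re-raised in the parent at that point. A minimal sketch, assuming a hypothetical combine that rejects an empty folder name:

```python
import multiprocessing as mp

def combine(folder, csvfile):
    # Hypothetical stand-in for the real work: fail on an empty folder name.
    if not folder:
        raise ValueError("folder name is empty")
    return csvfile

if __name__ == "__main__":
    jobs = [("folder1", "csvfile1"), ("", "csvfile2"), ("folder3", "csvfile3")]
    with mp.Pool(processes=2) as pool:
        pending = [(job, pool.apply_async(combine, args=job)) for job in jobs]
        for job, result in pending:
            try:
                # get() blocks until this task finishes and re-raises its exception, if any
                print("Completed:", job, "->", result.get())
            except ValueError as exc:
                print("Failed:", job, "->", exc)
```

Submitting each folder with apply_async keeps failures separate per task. With starmap or starmap_async, one failing task should make the whole call raise, so the other results are not returned.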

Akshay Sehgal has already explained how to use it. I will add only a few details.

You can write it more compactly using map() (for a single argument) or starmap() (for multiple arguments):

results = pool.starmap_async(combine, zip(fo,fi))

If you use the async version, you may need .get() to wait for all results:

def combine(a,b):
    return b

results = pool.starmap_async(combine, zip(fo,fi))
print(results.get())

If you use the non-async version, you don't need .get():

def combine(a,b):
    return b

results = pool.starmap(combine, zip(fo,fi))
print(results)

Processes may call print() at the same time, which can mix messages from different processes, so it is good to build a single string before printing:

print(f"Completed: {a} -> {b}")

import multiprocessing as mp

fo = ["folder1","folder2","folder3","folder4"]
fi = ["csvfile1","csvfile2","csvfile3","csvfile4"]

def combine(a,b):
    #YOUR CODE HERE
    # build one string so output from other processes is not mixed into it
    print(f"Completed: {a} -> {b}")
    return b

if __name__ == "__main__": # needed when workers are started with spawn (default on Windows and macOS)
    pool = mp.Pool(processes=4)

    # async version

    results = pool.starmap_async(combine, zip(fo,fi))
    # it needs `.get()` because it is `async`
    print(results.get())

    # non-async version

    results = pool.starmap(combine, zip(fo,fi))
    # it doesn't need `.get()` because it is not `async`
    print(results)

    pool.close() # no more tasks will be submitted
    pool.join() # wait for the workers to exit
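
Another way to avoid interleaved output, sketched here as an assumption rather than as part of the answers above, is to have workers return their message and print only in the parent process. pool.imap_unordered yields results as tasks finish; since it passes one argument per task, the hypothetical wrapper combine_job unpacks the (folder, csvfile) tuple.

```python
import multiprocessing as mp

fo = ["folder1", "folder2", "folder3", "folder4"]
fi = ["csvfile1", "csvfile2", "csvfile3", "csvfile4"]

def combine(a, b):
    # YOUR CODE HERE (placeholder, as in the answers above)
    return f"Completed: {a} -> {b}"

def combine_job(job):
    # Hypothetical wrapper: imap_unordered passes a single argument per task.
    return combine(*job)

if __name__ == "__main__":
    with mp.Pool(processes=4) as pool:
        # Results arrive in completion order, not submission order.
        for message in pool.imap_unordered(combine_job, zip(fo, fi)):
            print(message)  # only the parent prints, so lines cannot interleave
```

This also gives a simple progress report, because each line appears as soon as its folder is done.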


 