Python: Parallel execution of functions

I would like to execute a set of tasks in parallel. I have defined a method in a class that takes a parameter and performs an operation based on it. The class structure is shown below.

from concurrent.futures import ThreadPoolExecutor

class Test(object):

  def process_dataframe(self, id: int):
    pass  # per-id work goes here

  def run_task(self):
    thd = []
    for i in range(1, 10):
      thd.append("self.process_dataframe({0})".format(i))
    self.run_functions_in_parallel(thd)

  def run_functions_in_parallel(self, fns) -> bool:
    def wrap_function(self, fnToCall):
      try:
        eval(fnToCall)  # runs the call described by the string
        return "0"
      except Exception as e:
        return "{0}".format(e)

    thd = []
    isError = False
    executor = ThreadPoolExecutor(max_workers=len(fns))
    errorMessage = ""

    for fn in fns:
      t = executor.submit(wrap_function, self, fn)
      thd.append(t)

    # Collect the status string returned by each task
    for td in thd:
      ret = td.result()
      if ret != "0":
        isError = True
        errorMessage = errorMessage + "\n" + ret
    if isError:
      print(errorMessage)
      raise Exception(errorMessage)
    return True


I have managed to make it work, and the tasks execute properly. I am wondering whether there is a better or simpler way to accomplish the same thing. I would like to keep the run_functions_in_parallel method generic so that it can be used as a common method in a module.

You don't need a wrapper: ThreadPoolExecutor already captures any exception raised in a task and re-raises it when you call result() on the future. A function that always returns True or raises an error doesn't need a return value, but if the functions you want to call in parallel have return values, you should return their results. It is a bad idea to use a magic string as an error indicator: format(e) of a KeyError: 0 also yields "0". Better to use a unique value, such as None in this case. Don't use eval if you don't have to; here you can use functools.partial instead. Don't use too large a value for max_workers.
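To make the magic-string point concrete, here is a minimal sketch. The task `lookup_missing_key` is a hypothetical stand-in, not something from the question. It shows that formatting a KeyError(0) produces exactly the string used as the success marker, and that the future hands the original exception back without any wrapper.

```python
from concurrent.futures import ThreadPoolExecutor


def lookup_missing_key():
    # Hypothetical task: fails with KeyError(0)
    return {}[0]


# Formatting the error yields the same string as the "0" success marker
try:
    lookup_missing_key()
except Exception as e:
    message = "{0}".format(e)

with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(lookup_missing_key)
    # exception() waits for the task and returns what it raised (or None)
    caught = future.exception()
    try:
        future.result()  # result() re-raises the task's exception
    except KeyError as e:
        reraised = e

print(repr(message), type(caught).__name__, type(reraised).__name__)
```

With the wrapper from the question, this failing task would be reported as a success, whereas `future.result()` and `future.exception()` keep the error and its type intact.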

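from functools import partial
from concurrent.futures import ThreadPoolExecutor

class Test(object):
    def process_dataframe(self, id):
        pass  # per-id work goes here

    def run_task(self):
        functions = []
        for i in range(1, 10):
            functions.append(partial(self.process_dataframe, i))
        return self.run_functions_in_parallel(functions)

    def run_functions_in_parallel(self, functions, max_workers=8):
        # The with-block shuts the pool down once all tasks have finished
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            futures = [
                executor.submit(function)
                for function in functions
            ]

        errors = []
        results = []
        for future in futures:
            try:
                # result() re-raises any exception raised inside the task
                result = future.result()
            except Exception as e:
                errors.append(e)
            else:
                results.append(result)
        if errors:
            raise Exception(errors)
        return results

d = Test()
d.run_task()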
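Since the question asks for a generic helper that can live in a module, the same pattern can also be written as a plain function that accepts any zero-argument callables. The following is only a sketch under that assumption; `square` is a hypothetical task used for illustration, and the aggregated `Exception(errors)` mirrors the approach above rather than any standard convention.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def run_functions_in_parallel(functions, max_workers=8):
    """Run zero-argument callables in a thread pool.

    Returns the results in submission order. If any callable raised,
    raises a single Exception carrying the list of collected errors.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(function) for function in functions]

    # Leaving the with-block waits until every future has finished
    errors = []
    results = []
    for future in futures:
        error = future.exception()
        if error is not None:
            errors.append(error)
        else:
            results.append(future.result())
    if errors:
        raise Exception(errors)
    return results


def square(x):
    # Hypothetical task used only for this example
    return x * x


squares = run_functions_in_parallel([partial(square, i) for i in range(1, 10)])
print(squares)
```

Because the callables are bound with `partial` before submission, the helper needs no knowledge of their arguments, so bound methods such as `partial(self.process_dataframe, i)` work the same way as free functions.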

