Python: execute a function in parallel in a loop

I am trying to improve the execution time of a script that imports data from CSV files into a Graphite/Go-Carbon time-series database.

This is the loop that goes through all the zip files and passes each one to the function execute_run. I tried this code, but I got an error:

    for idx4, Lst_f in enumerate(full_csvfile_paths):
       if lst_metrics in Lst_f:
          zip_file = Lst_f
          with zipfile.ZipFile(zip_file) as zipobj:
             print("Using ZipFile:",zipobj.filename)
             #execute_run(zipobj.filename, confcsv_path, storage_type, serial)
             output = subprocess.run(execute_run(zipobj.filename, confcsv_path, storage_type, serial),stdout=subprocess.PIPE)
             print ("Return code: %i" % output.returncode)
             print ("Output data: %s" % output.stdout)

Error:

    Traceback (most recent call last):
      File "./02-pickle-client.py", line 451, in <module>
        main()
      File "./02-pickle-client.py", line 361, in main
        output = subprocess.run(execute_run(zipobj.filename, confcsv_path, storage_type, serial),stdout=subprocess.PIPE)
      File "/usr/lib64/python3.6/subprocess.py", line 423, in run
        with Popen(*popenargs, **kwargs) as process:
      File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
        restore_signals, start_new_session)
      File "/usr/lib64/python3.6/subprocess.py", line 1240, in _execute_child
        args = list(args)
    TypeError: 'NoneType' object is not iterable

Is there a way to execute the function execute_run X times in parallel and check that each run completed correctly?

Many thanks for your help.

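The problem is that subprocess.run expects the command line of an external program (for example a list of arguments), but here it is handed the return value of execute_run(...). That call has already run in the current process and returned None, hence the TypeError: 'NoneType' object is not iterable. To run a Python function in parallel, I would recommend multiprocessing.Pool with Pool.map or Pool.starmap instead of subprocess.run, as described in the multiprocessing documentation.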
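To make the cause concrete, here is a minimal sketch. The execute_run below is a hypothetical stand-in, since the real function is not shown in the question; like any function without a return statement it returns None, and that None is what subprocess.run then receives as its command.

```python
import subprocess


def execute_run(zip_path):
    # Hypothetical stand-in: does its work in-process and returns nothing (None).
    print("processing", zip_path)


# The function runs right here, in the current process, before subprocess.run is involved.
result = execute_run("example.zip")
print(result is None)

try:
    # Same situation as subprocess.run(execute_run(...)): the command argument is None.
    subprocess.run(result, stdout=subprocess.PIPE)
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```

So nothing is ever run in parallel by that line: the function has finished before subprocess.run is reached.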

With a pool, it could look something like this:

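    import multiprocessing as mp

    def My_Function(variable1, variable2, variable3):
        # Placeholder worker; define it at module level so it can be pickled.
        return variable1 + variable2 + variable3

    if __name__ == "__main__":
        data = [(1, 2, 3), (4, 5, 6)]  # placeholder: one tuple of arguments per call

        # Step 1: Create a multiprocessing.Pool and specify the number of worker processes (here 4).
        pool = mp.Pool(4)

        # Step 2: Use pool.starmap, which unpacks each tuple into the function's arguments.
        results = pool.starmap(My_Function, data)

        # Step 3: Don't forget to close the pool, then wait for the workers to exit.
        pool.close()
        pool.join()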
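Applied to the loop in the question, and to the wish to check that each run went well, one possible sketch uses concurrent.futures from the standard library instead of a raw pool, because each future reports its own result or exception. This is an illustration under assumptions, not the original script: execute_run here is a hypothetical stand-in with the same four parameters as in the question, and run_all, max_workers and executor_cls are names introduced only for this example.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed


def execute_run(zip_path, confcsv_path, storage_type, serial):
    # Hypothetical stand-in for the question's execute_run; the real one
    # imports the CSV data into Graphite. Here it only validates its input.
    if not zip_path.endswith(".zip"):
        raise ValueError("not a zip file: %s" % zip_path)
    return zip_path


def run_all(zip_paths, confcsv_path, storage_type, serial,
            max_workers=4, executor_cls=ProcessPoolExecutor):
    """Run execute_run once per path; return ({path: result}, {path: exception})."""
    done, failed = {}, {}
    with executor_cls(max_workers=max_workers) as executor:
        # Submit one task per file and remember which future belongs to which path.
        futures = {
            executor.submit(execute_run, path, confcsv_path, storage_type, serial): path
            for path in zip_paths
        }
        for future in as_completed(futures):
            path = futures[future]
            try:
                # result() re-raises the exception from the worker, if there was one.
                done[path] = future.result()
            except Exception as exc:
                failed[path] = exc
    return done, failed


if __name__ == "__main__":
    # Placeholder arguments, not taken from the original script.
    done, failed = run_all(["a.zip", "b.zip", "notes.txt"],
                           "conf.csv", "some_type", "0001")
    print("succeeded:", sorted(done))
    print("failed:", sorted(failed))
```

In the original loop, zip_paths would be the entries of full_csvfile_paths that contain lst_metrics. If most of the time is spent waiting on the network rather than parsing, passing executor_cls=ThreadPoolExecutor may be enough and avoids the need to pickle arguments and results.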
