
Using a magic command with the Python multiprocessing library

I'm trying to run a Jupyter notebook file once for each input in a Python list, from another notebook. I've used Jupyter Notebook's magic command %run to accomplish the task:

input_list= [1,  131,  312,  327,  348,  485,  469, 1218, 1329, 11212]
for i in input_list:
    try:
        input = i
        %run ./notebook.ipynb
    except:
        pass

The code works, but the execution time is very high, so I decided to use the multiprocessing library to run it faster.

Function used inside multiprocessing:

def function(i):
    try:
        input = i
        print(input)#print the current element passed
        %run ./notebook.ipynb
    except:
        pass

Multiprocessing code:

    from multiprocessing import Pool, cpu_count
    from tqdm import tqdm

    p = Pool(8)

    tqdm(p.imap(function, input_list))

    p.close()
    p.join()

But the problem is that the argument passed to the function is not passed on to the notebook run by the %run magic command.

I get an error like "input is not defined".

What would be a possible solution for this problem?

It works when you follow the guide here on how to use arguments.
I'll illustrate with a minimal working example.

Make a notebook called add3.ipynb with the following contents as the only cell in it:

o = i + 3
print (f"where the input is {i}; the  output is {o}\n")

Then, in the notebook that controls the runs, use the following in a code cell to run it with the various values you want:

# based on https://pymotw.com/3/multiprocessing/basics.html
import multiprocessing

def worker(i):
    try:
        print (f"input is {i}\n")#print the current element passed
        %run ./add3.ipynb
    except:
        pass
    

input_list= [1,  131,  312,  327,  348,  485,  469, 1218, 1329, 11212]


if __name__ == '__main__':
    jobs = []
    for i in input_list:
        p = multiprocessing.Process(target=worker, args=(i,))
        jobs.append(p)
        p.start()

I'll paste a typical run of that at the bottom of this post.


I still suggest you use papermill for this, so you can parameterize the notebook and then save a new version of the file for each run, as if each were a report.
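As a rough sketch of the papermill route, the following builds one job per input value and hands each to papermill.execute_notebook. The helper names build_jobs and run_jobs, the runs output directory, and the output file naming are assumptions made for illustration; papermill is a third-party package that must be installed separately. papermill normally injects the values after a cell tagged "parameters" in the template notebook.

```python
from pathlib import Path


def build_jobs(input_list, template="add3.ipynb", out_dir="runs"):
    """Return one (input_path, output_path, parameters) tuple per value.

    The names and the output layout are hypothetical; adjust to taste.
    """
    out = Path(out_dir)
    return [
        (template, str(out / f"add3_{value}.ipynb"), {"i": value})
        for value in input_list
    ]


def run_jobs(jobs):
    """Execute each job with papermill, saving one output notebook per value."""
    import papermill as pm  # third-party: pip install papermill

    for input_path, output_path, parameters in jobs:
        # Make sure the output directory exists before papermill writes to it.
        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
        pm.execute_notebook(input_path, output_path, parameters=parameters)


if __name__ == "__main__":
    input_list = [1, 131, 312, 327, 348, 485, 469, 1218, 1329, 11212]
    jobs = build_jobs(input_list)
    print(f"prepared {len(jobs)} jobs")
    # run_jobs(jobs)  # uncomment once papermill and add3.ipynb are available
```

Each output notebook then holds the injected value of i together with the results of that run.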

Alternatively, you can use other means to inject code or to construct the notebook to run with the input value. A lot of the time I use a template in string form inside a script, with a placeholder for the value. Then I run the script to generate the notebooks with the value in them using the string replace() method, save the resulting strings as notebook files, and run those notebooks using jupytext or jupyter nbconvert. nbformat can be useful for building such a notebook file too. That way you can generate reports in notebook form with the results from each run.
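A minimal sketch of that template approach, assuming a placeholder named PLACEHOLDER_VALUE and helper names (make_notebook, write_notebooks) invented for this example. It writes the notebook JSON by hand with the standard library so that it has no dependencies; nbformat would be the more robust way to build the same structure.

```python
import json
from pathlib import Path

# Template for the single code cell; PLACEHOLDER_VALUE is a made-up marker.
TEMPLATE_SOURCE = (
    "i = PLACEHOLDER_VALUE\n"
    "o = i + 3\n"
    'print(f"where the input is {i}; the output is {o}")\n'
)


def make_notebook(value):
    """Return a minimal version 4 notebook dict with the value filled in."""
    source = TEMPLATE_SOURCE.replace("PLACEHOLDER_VALUE", repr(value))
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": source,
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def write_notebooks(input_list, out_dir):
    """Write one notebook file per input value and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for value in input_list:
        path = out / f"add3_{value}.ipynb"
        path.write_text(json.dumps(make_notebook(value), indent=1), encoding="utf-8")
        paths.append(path)
    return paths
```

The generated files can then be executed one by one, for example with jupyter nbconvert --to notebook --execute followed by the file name.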

Also, if you don't need the code you're calling to be in a notebook, it is often more convenient to save it as a Python script (ending in .py) or an IPython script (ending in .ipy). (The latter allows you to use IPython magics in a script and is often an easier way to develop when you are used to Jupyter. However, the resulting script runs much slower than pure Python, so I usually end up converting to pure Python and only use the .ipy form early in development.) For example, the contents of the one cell in my example add3.ipynb could simply have been saved as a script add3.py. Then from a notebook I can run it like the following (leaving out multiprocessing for the sake of simplicity):

input_list= [1,  131,  312,  327,  348,  485,  469, 1218, 1329, 11212]
for i in input_list:
    %run -i add3.py

Note the use of the -i option with %run to "run the file in IPython's namespace instead of an empty one." That option isn't necessary when using %run to run another notebook because, by default, it's as if you are running the other notebook's code in the calling notebook. I like the greater flexibility of using %run with a script because often I don't want the script running in the same namespace. The alternatives I mentioned (papermill, jupytext, and jupyter nbconvert) execute an external notebook separately from the current namespace.
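If the code lives in a plain script such as add3.py, multiprocessing can be brought back in without any magics. The sketch below is one possible way to do it, not part of the example above: it uses the standard library's runpy.run_path with init_globals to preset i before the script runs, roughly standing in for what %run -i provides. The helper names run_with_input and run_all are made up for illustration, and it assumes the script assigns its result to o as add3.py does.

```python
import runpy
from multiprocessing import Pool


def run_with_input(script_path, value):
    """Run the script with `i` preset and return the `o` it computes."""
    namespace = runpy.run_path(script_path, init_globals={"i": value})
    # Return only the result; the full namespace may not be picklable.
    return namespace["o"]


def run_all(script_path, input_list, processes=4):
    """Run the script once per value across a pool of worker processes."""
    with Pool(processes) as pool:
        jobs = [(script_path, value) for value in input_list]
        return pool.starmap(run_with_input, jobs)
```

Calling run_all("add3.py", input_list) should return the outputs in the same order as the inputs. On platforms where worker processes are spawned rather than forked, functions defined in a notebook cell may not be importable by the workers, so it is probably safest to keep these helpers in their own .py module and import them.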


Result seen when running the minimal working example:

input is 1

input is 131

input is 312
input is 327


input is 348
input is 485


input is 469
input is 1218


input is 11212
input is 1329

where the input is 131; the  output is 134


where the input is 1; the  output is 4
where the input is 312; the  output is 315
where the input is 327; the  output is 330


where the input is 485; the  output is 488


where the input is 1218; the  output is 1221
where the input is 469; the  output is 472


where the input is 348; the  output is 351

where the input is 1329; the  output is 1332

where the input is 11212; the  output is 11215
