简体   繁体   中英

Transferring modules between two processes with python multiprocessing

So i have a problem. I'm trying to make my imports faster, so i started using multiprocessing module to split a group of imports into two functions, and then run each on separate core, thus speeding the imports up. But now the code will not recognize the modules at all. What am I doing wrong ?

import multiprocessing


def core1():
    import wikipedia
    import subprocess
    import random
    return wikipedia, subprocess, random



def core2():
    from urllib import request
    import json
    import webbrowser
    return request, json, webbrowser


if __name__ == "__main__":
    start_core_1 = multiprocessing.Process(name='worker 1', target=core1, args = core2())
    start_core_2 = multiprocessing.Process(name='worker 2', target=core2, args = core1())
    start_core_1.start()
    start_core_2.start()

while True:
    user = input('[!] ')
    with request.urlopen('https://api.wit.ai/message?v=20160511&q=%s&access_token=Z55PIVTSSFOETKSBPWMNPE6YL6HVK4YP' % request.quote(user)) as wit_api:  # call to wit.ai api
        wit_api_html = wit_api.read()
        wit_api_html = wit_api_html.decode()
        wit_api_data = json.loads(wit_api_html)
    intent = wit_api_data['entities']['Intent'][0]['value']
    term = wit_api_data['entities']['search_term'][0]['value']
    if intent == 'info_on':
        with request.urlopen('https://kgsearch.googleapis.com/v1/entities:search?query=%s&key=AIzaSyCvgNV4G7mbnu01xai0f0k9NL2ito8vY6s&limit=1&indent=True' % term.replace(' ', '%20')) as response:
            google_knowledge_base_html = response.read()
            google_knowledge_base_html = google_knowledge_base_html.decode()
            google_knowledge_base_data = json.loads(google_knowledge_base_html)
            print(google_knowledge_base_data['itemListElement'][0]['result']['detailedDescription']['articleBody'])
    else:
        print('Something')

I think you are missing the important parts of the whole picture ie crucial parts of what you need to know about multiprocessing when using it.

Here are some crucial parts that you have to know and then you will understand why you can't just import modules in child process and speed up the thing. Even returning loaded modules is not a perfect answer too.

First, when you use multiprocess.Process a child process is forked (on Linux) or spawned (on Windows). I'll assume you are using Linux. In that case, every child process inherits every loaded module from parent (global state). When child process changes anything, like global variables or imports new modules, those stay just in its context. So, parent process is not aware of it. I believe part of this can also be of interest.

Second, module can be a set of classes, external lib bindings, functions, etc. and some of them quite probably can't be pickled, at least with pickle . Here is the list of what can be pickled in Python 2.7 and in Python 3.X . There are even libraries that give you 'more pickling power' like dill . However, I'm not sure pickling whole modules is a good idea at all, not to mention that you have slow imports and yet you want to serialize them and send them to parent process. Even if you manage to do it, it doesn't sound like a best approach.

Some of the ideas on how to change the perspective:

  1. Try to revise which module you need and why? Maybe you can use other modules that can give you similar functionalities. Maybe these modules are overweighing and bringing too much with them and cost is great in comparing to what you get.

  2. If you have slow loading of modules, try to make a script that will always be running, so you do not have to run it multiple times.

  3. If you really need those modules maybe you can separate their using in two processes and then each process does it's own thing. Example would be, one process parses page, other process processes and so on. That way you sped up the loading but you have to deal with passing messages between processes.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM