
Using Python multiprocessing library inside nested objects

I'm trying to use the multiprocessing library to parallelize some expensive calculations without blocking other, much lighter ones. Both need to interact through some variables, although they may run at different paces.

To show this, I have created the following example, which works fine:

import multiprocessing
import time
import numpy as np


class SumClass:

    def __init__(self):

        self.result = 0.0
        self.p = None
        self.return_value = None

    def expensive_function(self, new_number, return_value):

        # Execute expensive calculation
        #######
        time.sleep(np.random.randint(5, 11))  # random delay of 5 to 10 seconds
        return_value.value = self.result + new_number
        #######

    def execute_function(self, new_number):

        print(' New number received: %f' % new_number)
        self.return_value = multiprocessing.Value("f", 0.0, lock=True)
        self.p = multiprocessing.Process(target=self.expensive_function, args=(new_number, self.return_value))
        self.p.start()

    def is_executing(self):

        if self.p is not None:

            if not self.p.is_alive():
                self.result = self.return_value.value
                self.p = None
                return False

            else:
                return True

        else:
            return False


if __name__ == '__main__':

    sum_obj = SumClass()
    current_value = 0

    while True:

        if not sum_obj.is_executing():

            # Randomly determine whether the function must be executed or not
            if np.random.rand() < 0.25:
                print('Current sum value: %f' % sum_obj.result)
                new_number = np.random.rand(1)[0]
                sum_obj.execute_function(new_number)

        # Execute other (light) stuff
        #######
        print('Executing other stuff')
        current_value += sum_obj.result * 0.1
        print('Current value: %f' % current_value)
        time.sleep(1)
        #######

Basically, the main loop executes some light function and, depending on a random condition, sends heavy work to another process, provided the previous job has already finished. The heavy work is carried out by an object that needs to store some data between executions. Although expensive_function takes some time, the light function keeps executing without being blocked.

Although the above code gets the job done, I'm wondering: is this the best or most appropriate way to do it?

Besides, let us suppose the class SumClass has an instance of another object, which also needs to store data. For example:

import multiprocessing
import time
import numpy as np


class Operator:

    def __init__(self):

        self.last_value = 1.0

    def operate(self, value):

        print('    Operation, last value: %f' % self.last_value)
        self.last_value *= value
        return self.last_value


class SumClass:

    def __init__(self):

        self.operator_obj = Operator()
        self.result = 0.0

        self.p = None
        self.return_value = None

    def expensive_function(self, new_number, return_value):

        # Execute expensive calculation
        #######
        time.sleep(np.random.randint(5, 11))  # random delay of 5 to 10 seconds

        # Apply operation
        number = self.operator_obj.operate(new_number)

        # Apply other operation
        return_value.value = self.result + number
        #######

    def execute_function(self, new_number):

        print('    New number received: %f' % new_number)
        self.return_value = multiprocessing.Value("f", 0.0, lock=True)
        self.p = multiprocessing.Process(target=self.expensive_function, args=(new_number, self.return_value))
        self.p.start()

    def is_executing(self):
        if self.p is not None:
            if not self.p.is_alive():
                self.result = self.return_value.value
                self.p = None
                return False
            else:
                return True
        else:
            return False


if __name__ == '__main__':

    sum_obj = SumClass()
    current_value = 0

    while True:

        if not sum_obj.is_executing():

            # Randomly determine whether the function must be executed or not
            if np.random.rand() < 0.25:
                print('Current sum value: %f' % sum_obj.result)
                new_number = np.random.rand(1)[0]
                sum_obj.execute_function(new_number)

        # Execute other (light) stuff
        #######
        print('Executing other stuff')
        current_value += sum_obj.result * 0.1
        print('Current value: %f' % current_value)
        time.sleep(1)
        #######

Now, inside expensive_function, a member function of the Operator object is used, which needs to store the number passed.

As expected, the member variable last_value does not change, i.e. it does not keep any value between executions.
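The behaviour described above can be reproduced in isolation. The following is only a minimal sketch: `child_task` is a hypothetical helper that is not part of the original code, and `Operator` is trimmed to the bare minimum. It assumes the cause is that the child process works on its own copy of the object, so a mutation made there never reaches the parent.

```python
import multiprocessing


class Operator:
    def __init__(self):
        self.last_value = 1.0

    def operate(self, value):
        self.last_value *= value
        return self.last_value


def child_task(operator_obj, value):
    # Hypothetical helper: runs in the child process and mutates
    # the child's private copy of operator_obj only.
    operator_obj.operate(value)
    print('child  last_value: %f' % operator_obj.last_value)


if __name__ == '__main__':
    op = Operator()
    p = multiprocessing.Process(target=child_task, args=(op, 3.0))
    p.start()
    p.join()
    # The parent's object was never touched by the child.
    print('parent last_value: %f' % op.last_value)
```

The child reports 3.0 while the parent still reports 1.0, which is why any state that must survive has to be sent back to the parent explicitly.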

Is there any way of doing this properly?

I imagine I could rearrange everything so that only one class level is needed, and that would work. However, this is a toy example; in reality there are several levels of complex objects, and that would be hard.

Thank you very much in advance!

from concurrent.futures import ThreadPoolExecutor
import requests
import timeit


def timer(number, repeat):
    # Decorator that times func immediately and prints the mean run time in seconds
    def wrapper(func):
        runs = timeit.repeat(func, number=number, repeat=repeat)
        print(sum(runs) / len(runs))
    return wrapper


URL = "https://httpbin.org/uuid"


# Left as plain Python: numba's nopython mode cannot compile calls into requests
def fetch(session, url):
    with session.get(url) as response:
        print(response.json()['uuid'])


@timer(1, 1)
def runner():
    with ThreadPoolExecutor(max_workers=25) as executor:
        with requests.Session() as session:
            # Consume the iterator so every request finishes while the session
            # is open and exceptions raised in fetch are not silently dropped
            list(executor.map(fetch, [session] * 100, [URL] * 100))

Maybe this helps.

I'm using ThreadPoolExecutor for multithreading; you can also use ProcessPoolExecutor.
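As a rough sketch of how the ProcessPoolExecutor route could be applied to the nested-object case from the question: the worker returns the updated Operator together with the result, and the parent adopts both once the future is done. This is one possible arrangement under stated assumptions, not the original author's code. The module-level `expensive_function`, the `executor` constructor argument and the `future` attribute are all hypothetical, and the short `time.sleep` stands in for the real calculation.

```python
import time
from concurrent.futures import ProcessPoolExecutor


class Operator:
    def __init__(self):
        self.last_value = 1.0

    def operate(self, value):
        self.last_value *= value
        return self.last_value


def expensive_function(operator_obj, result, new_number):
    # Runs in a worker process on a pickled copy of operator_obj.
    time.sleep(0.5)  # stand-in for the expensive calculation
    number = operator_obj.operate(new_number)
    # Return the updated copy with the result so the parent can keep it.
    return result + number, operator_obj


class SumClass:
    def __init__(self, executor):
        self.executor = executor
        self.operator_obj = Operator()
        self.result = 0.0
        self.future = None

    def execute_function(self, new_number):
        self.future = self.executor.submit(
            expensive_function, self.operator_obj, self.result, new_number)

    def is_executing(self):
        if self.future is None:
            return False
        if not self.future.done():
            return True
        # Worker finished: adopt the result and the updated nested state.
        self.result, self.operator_obj = self.future.result()
        self.future = None
        return False


if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=1) as executor:
        sum_obj = SumClass(executor)
        for new_number in (2.0, 3.0):
            sum_obj.execute_function(new_number)
            while sum_obj.is_executing():
                print('Executing other stuff')
                time.sleep(0.1)
            print('result: %f, last_value: %f'
                  % (sum_obj.result, sum_obj.operator_obj.last_value))
```

Because the whole Operator is pickled in both directions, this only works if every nested object is picklable, and the copying cost grows with the size of the state. For deep object trees it may be cheaper to return only the fields that changed.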

For your compute-expensive operation, you can use numba to compile your function to cached machine code for faster execution, as long as the function sticks to the numeric Python and NumPy code that numba supports.
