
Run a class method in parallel processes in Python

I am trying and failing to run a huge loop in parallel. The loop is a single method of a class, and inside the loop I call another method of the same class. The code runs, but for some reason there is only one process in the list and the output (see code) is always 'Worker 0'. Either the processes are not created or they are not running in parallel. The structure is as follows:

main.py

from my_class import MyClass

def main():
    class_object = MyClass()
    class_object.method()

if __name__ == '__main__':
    main()

my_class.py

from multiprocessing import Process

class MyClass(object):
    def __init__(self):
        # do something
        pass

    def _method(self, worker_num, n_workers, amount, job, data):
        for i, val in enumerate(job):
            print('Worker %d' % worker_num)
            self.another_method(val, data)

    def another_method(self, val, data):
        # do something to the data
        pass

    def method(self):
        # definitions of data and job_size go here

        n_workers = 16
        chunk = job_size // n_workers
        resid = job_size - chunk * n_workers

        workers = []
        for worker_num in range(n_workers):
            st = worker_num * chunk
            amount = chunk if worker_num != n_workers - 1 else chunk + resid
            worker = Process(target=self._method, args=[worker_num, n_workers, amount, job[st:st+amount], data])
            worker.start()
            workers.append(worker)

        for worker in workers:
            worker.join()

        return data

I have read that child processes require the main module to be importable, but I have no idea how to do that in my case.

Question: ... but still only one core is in use. So the question is, can I use multiple cores with Process objects?

Which CPU a given Process runs on is not decided by the Python interpreter; the operating system's scheduler assigns it.
Relevant: on-what-cpu-cores-are-my-python-processes-running
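
To see how many cores the operating system lets a process use at all, something like the following may help. It is only a sketch: the helper name available_cores is made up for illustration, and os.sched_getaffinity is assumed to exist only on some platforms (Linux among them), so the code falls back to os.cpu_count() elsewhere.

```python
import os


def available_cores():
    """Return the number of CPU cores this process is allowed to run on."""
    if hasattr(os, 'sched_getaffinity'):
        # Set of core ids the scheduler may place this process on.
        return len(os.sched_getaffinity(0))
    # Fallback where sched_getaffinity is missing; cpu_count() may be None.
    return os.cpu_count() or 1


print('cores available to this process:', available_cores())
```

If this prints 1, every worker has to share a single core no matter how many Process objects are started.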

Extend your _method(...) as follows to see what actually happens:

Note: getpidcore(pid) is distribution dependent and could fail!

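import os

def getpidcore(pid):
    # Field 39 of /proc/<pid>/stat is the CPU the task last ran on (Linux only).
    # Counting from the end sidesteps spaces in the process name, but the
    # offset depends on how many fields the kernel writes, so it could fail.
    with open('/proc/{}/stat'.format(pid), 'rb') as fh:
        core = int(fh.read().split()[-14])
        return core

class MyClass(object):
    ...
    def _method(self, worker_num, n_workers, amount, job, data):
        for i, val in enumerate(job):
            core = getpidcore(os.getpid())
            print('core:{} pid:{} Worker({})'.format(core, os.getpid(), (worker_num, n_workers, amount, job)))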

Output :

core:1 pid:7623 Worker((0, 16, 1, [1]))
core:1 pid:7625 Worker((2, 16, 1, [3]))
core:0 pid:7624 Worker((1, 16, 1, [2]))
core:1 pid:7626 Worker((3, 16, 1, [4]))
core:1 pid:7628 Worker((5, 16, 1, [6]))
core:0 pid:7627 Worker((4, 16, 1, [5]))

Tested with Python: 3.4.2 on Linux
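
For reference, the structure from the question can be condensed into a single file that runs as is. This is a sketch under assumptions, not the asker's actual code: the job list, the worker count of 4, and the helper name split_job are made up for illustration, while the chunk/resid arithmetic is copied from the question. The if __name__ == '__main__' guard is what keeps the module safely importable by child processes on platforms that start them in a fresh interpreter.

```python
import os
from multiprocessing import Process


def split_job(job, n_workers):
    """Split job into n_workers slices; the last slice takes the remainder."""
    chunk = len(job) // n_workers
    resid = len(job) - chunk * n_workers
    parts = []
    for worker_num in range(n_workers):
        st = worker_num * chunk
        amount = chunk if worker_num != n_workers - 1 else chunk + resid
        parts.append(job[st:st + amount])
    return parts


class MyClass(object):
    def _method(self, worker_num, job):
        # Runs in the child process, so os.getpid() differs per worker.
        for val in job:
            print('pid:{} Worker {} val {}'.format(os.getpid(), worker_num, val))

    def method(self, job, n_workers=4):
        workers = []
        for worker_num, part in enumerate(split_job(job, n_workers)):
            worker = Process(target=self._method, args=(worker_num, part))
            worker.start()
            workers.append(worker)
        # Wait for every child before returning.
        for worker in workers:
            worker.join()


if __name__ == '__main__':
    # Hypothetical job: eight items spread over four workers.
    MyClass().method(list(range(8)))
```

With this arithmetic, a job shorter than n_workers gives chunk == 0, so every item lands in the last slice and only one worker has anything to do.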
