
Managing a fixed number of workers in Python

I need to implement a system with a master process that manages slave processes which perform other tasks. I have two different slave types and want 6 instances of each. I've written something that works, but it kills each process and starts a new one when its task completes. This isn't desirable because spawning a new process is expensive. I'd prefer to keep each slave running as a process, get notified when it's done, and run it again with new input.

My current pseudo-ish code is below. It isn't perfect; I am winging it because I don't have the actual code with me.

import multiprocessing
from time import sleep

# SlaveTypeB is pretty much the same.
class SlaveTypeA(multiprocessing.Process):
    def __init__(self, val):
        multiprocessing.Process.__init__(self)
        self.val = val
        self.result = multiprocessing.Queue(1)
        self.start()
    def run(self):
        # In real life, run does something that takes a few seconds.
        sleep(2)
        # For SlaveTypeB, assume it writes self.val to a file instead of incrementing.
        self.result.put(self.val + 1)
    def getResult(self):
        return self.result.get()


if __name__ == "__main__":
    MAX_PROCESSES = 6
    # In real life, the input will grow as the while loop is being processed
    input = [1, 4, 5, 6, 9, 6, 3, 3]
    aProcessed = []
    aSlaves = []
    bSlaves = []

    while len(input) > 0 or len(aProcessed) > 0:
        if len(aSlaves) < MAX_PROCESSES and len(input) > 0:
            aSlaves.append(SlaveTypeA(input.pop(0)))
        if len(bSlaves) < MAX_PROCESSES and len(aProcessed) > 0:
            bSlaves.append(SlaveTypeB(aProcessed.pop(0)))
        for aSlave in aSlaves[:]:
            if not aSlave.is_alive():
                aProcessed.append(aSlave.getResult())
                aSlaves.remove(aSlave)
        for bSlave in bSlaves[:]:
            if not bSlave.is_alive():
                bSlaves.remove(bSlave)

How can I make it so that the processes in aSlaves and bSlaves aren't killed and respawned? I'm thinking I could use a pipe, but I'm not sure how to tell when a process is done without blocking on it.

EDIT: I rewrote this using pipes, and it solved my issue of not being able to keep processes running. I'd still like input on the best way to do this. I've left out the SlaveTypeB part, since having just one worker type simplifies the problem.

from multiprocessing import Process, Pipe
from time import sleep

class Slave(Process):
    def __init__(self, id):
        Process.__init__(self)
        # Set id, set idle state = True, etc.
        self.parentCon, self.childCon = Pipe()
        self.start()

    def run(self):
        while True:
            input = self.childCon.recv()
            # Do something here in real life
            sleep(2)
            self.childCon.send(input + 1)

    #def isIdle/setIdle():
        # Getter/setter for idle

    def tryGetResult(self):
        if self.parentCon.poll():
            return self.parentCon.recv()
        return False

    def process(self, input):
        self.parentCon.send(input)

if __name__ == '__main__':
    MAX_PROCESSES = 6
    jobs = [1, 4, 5, 6, 9, 6, 3, 3]
    slaves = []
    for i in range(MAX_PROCESSES):
        slaves.append(Slave(i))
    while len(jobs) > 0:
        for slave in slaves:
            result = slave.tryGetResult()
            if result:
                # Do something with result
                slave.setIdle(True)
            if slave.isIdle() and len(jobs) > 0:
                slave.process(jobs.pop())
                slave.setIdle(False)

EDIT 2: Got it, see the answer below.

Create two Queues? Like worktodoA and worktodoB, and have your workers idle while they wait for something to be put in the queue; if the item put there is, let's say, 'quit', they will quit.

Otherwise you should give tMC's comment a shot.

Looks like using a SyncManager was the best choice for this situation.

import multiprocessing
from multiprocessing import Process
from multiprocessing.managers import SyncManager

class Master(SyncManager):
    pass

input = [1, 4, 5, 6, 9, 6, 6, 3, 3]
def getNextInput():
    # Check that input isn't empty first
    return input.pop()

class Slave(Process):
    def __init__(self):
        Process.__init__(self)
        self.start()
    def run(self):
        Master.register("getNextInput", getNextInput)
        m = Master(('localhost', 5000))
        m.connect()
        while True:
            input = m.getNextInput()
            # Check for None first
            self.process(input)
    def process(self, input):
        print("Processed " + str(input))

if __name__ == "__main__":
    MAX_PROCESSES = 6
    Master.register("getNextInput", getNextInput)
    m = Master(('localhost', 5000))
    m.start()
    slaves = [Slave() for i in range(MAX_PROCESSES)]
    while True:
        pass
