
Wrap Multiprocess Pool Inside Loop (Shared Memory Between Processes)

I'm using the Python package "deap" to solve some multiobjective optimization problems with genetic algorithms. The objective functions can be quite expensive to evaluate, and because of the evolutionary nature of a GA, that cost compounds pretty quickly. Now, this package does have some support for parallelizing the evolutionary computations with multiprocessing.

However, I'd like to go one step further and run the optimization multiple times, with different values for some of the optimization parameters. For instance, I might want to solve the optimization problem with different values of the weights.

This seems like a pretty natural case for loops, but the problem is that these parameters must be defined in the global scope of the program (i.e., above the "main" function) so that all the subprocesses know about them. Here's some pseudo-code:

import itertools
import multiprocessing

import numpy as np
from deap import base, creator, tools

# define deap parameters - they have to be in the global scope
toolbox = base.Toolbox()
history = tools.History()
weights = [1, 1, -1]  # This is primarily what I want to vary
creator.create("Fitness", base.Fitness, weights=weights)
creator.create("Individual", np.ndarray, fitness=creator.Fitness)

def main():
    # run the GA to solve the multiobjective optimization problem
    return my_optimized_values

if __name__ == '__main__':
    ## What I'd like to do but can't ##
    ## all_weights = list(itertools.product([1, -1], repeat=3))
    ## for combo in all_weights:
    ##     weights = combo
    ##
    pool = multiprocessing.Pool(processes=6)
    # This can be down here; it distributes the GA computations to a pool of workers
    toolbox.register("map", pool.map)
    my_values = main()

I've investigated various possibilities, like multiprocessing.Value, the pathos fork of multiprocessing, and others, but in the end there's always a problem with the child processes reading the Individual class.

I've posed this question on the deap users' group, but it's not nearly as big a community as SO. Plus, it seems to me that this is more of a general conceptual Python question than a deap-specific issue. My current solution is just to run the code multiple times, changing some of the parameter definitions each time. At least this way the GA calculations are still parallelized, but it does require more manual intervention than I'd like.

Any advice or suggestions are greatly appreciated!

Use the initializer / initargs keyword arguments to Pool to pass different values for the global variables you need to change on each run. The initializer function will be called with initargs as its arguments in each worker process inside your Pool, as soon as it starts up. You can set your global variables to the desired values there, and they'll be set properly inside each child for the lifetime of the pool.
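As a minimal sketch of the mechanism itself, independent of deap (scale, init_worker, and task are hypothetical names used just for illustration), each worker sets a module-level global from initargs before it handles any tasks:

import multiprocessing

scale = None  # set in each worker by the initializer

def init_worker(value):
    # Runs once in every worker process, when the pool starts it.
    global scale
    scale = value

def task(x):
    return x * scale

if __name__ == '__main__':
    for s in (2, 10):
        pool = multiprocessing.Pool(processes=2, initializer=init_worker, initargs=(s,))
        print(pool.map(task, [1, 2, 3]))  # [2, 4, 6], then [10, 20, 30]
        pool.close()
        pool.join()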

You'll need to create a different Pool for each run, but that shouldn't be a problem:

import itertools
import multiprocessing

import numpy as np
from deap import base, creator, tools

toolbox = base.Toolbox()
history = tools.History()
weights = None  # We'll set this in the children later.


def init(_weights):
    # This runs once in each child process, as soon as it starts up.
    global weights
    weights = _weights
    creator.create("Fitness", base.Fitness, weights=weights)
    creator.create("Individual", np.ndarray, fitness=creator.Fitness)


if __name__ == '__main__':
    all_weights = list(itertools.product([1, -1], repeat=3))
    for combo in all_weights:
        weights = combo
        pool = multiprocessing.Pool(processes=6, initializer=init, initargs=(weights,))
        toolbox.register("map", pool.map)
        my_values = main()  # main() as defined in the question
        pool.close()
        pool.join()
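Note that the initializer runs only once per worker, when that worker process starts; it is not re-run for each task. That's why a fresh Pool is created for each weight combination instead of reusing one pool across loop iterations.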

I have also been uncomfortable with DEAP's use of global scope, and I think I have an alternative solution for you.

It is possible to import a different version of each module per loop iteration, thereby avoiding any reliance on the global scope.

import importlib

# bind each module to a local name, rather than relying on module-level globals
this_random = importlib.import_module("random")
this_creator = importlib.import_module("deap.creator")
this_algorithms = importlib.import_module("deap.algorithms")
this_base = importlib.import_module("deap.base")
this_tools = importlib.import_module("deap.tools")

As far as I can tell, this seems to play nicely with multiprocessing.

As an example, here is a version of DEAP's onemax_mp.py that avoids putting any of the DEAP objects in the global scope. I've included a loop in __main__ that changes the weights per iteration. (It maximizes the number of ones the first time, and minimizes it the second time.) Everything works fine with multiprocessing.

#!/usr/bin/env python2.7
#    This file is part of DEAP.
#
#    DEAP is free software: you can redistribute it and/or modify
#    it under the terms of the GNU Lesser General Public License as
#    published by the Free Software Foundation, either version 3 of
#    the License, or (at your option) any later version.
#
#    DEAP is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
#    GNU Lesser General Public License for more details.
#
#    You should have received a copy of the GNU Lesser General Public
#    License along with DEAP. If not, see <http://www.gnu.org/licenses/>.

import array
import multiprocessing
import sys

if sys.version_info < (2, 7):
    print("mpga_onemax example requires Python >= 2.7.")
    exit(1)

import numpy
import importlib


def evalOneMax(individual):
    return sum(individual),


def do_onemax_mp(weights, random_seed=None):
    """ Run the onemax problem with the given weights and random seed. """

    # create local copies of each module
    this_random = importlib.import_module("random")
    this_creator = importlib.import_module("deap.creator")
    this_algorithms = importlib.import_module("deap.algorithms")
    this_base = importlib.import_module("deap.base")
    this_tools = importlib.import_module("deap.tools")

    # hoisted from global scope
    this_creator.create("FitnessMax", this_base.Fitness, weights=weights)
    this_creator.create("Individual", array.array, typecode='b',
                        fitness=this_creator.FitnessMax)
    this_toolbox = this_base.Toolbox()
    this_toolbox.register("attr_bool", this_random.randint, 0, 1)
    this_toolbox.register("individual", this_tools.initRepeat,
                          this_creator.Individual, this_toolbox.attr_bool, 100)
    this_toolbox.register("population", this_tools.initRepeat, list,
                          this_toolbox.individual)
    this_toolbox.register("evaluate", evalOneMax)
    this_toolbox.register("mate", this_tools.cxTwoPoint)
    this_toolbox.register("mutate", this_tools.mutFlipBit, indpb=0.05)
    this_toolbox.register("select", this_tools.selTournament, tournsize=3)

    # hoisted from __main__
    this_random.seed(random_seed)
    pool = multiprocessing.Pool(processes=4)
    this_toolbox.register("map", pool.map)
    pop = this_toolbox.population(n=300)
    hof = this_tools.HallOfFame(1)
    this_stats = this_tools.Statistics(lambda ind: ind.fitness.values)
    this_stats.register("avg", numpy.mean)
    this_stats.register("std", numpy.std)
    this_stats.register("min", numpy.min)
    this_stats.register("max", numpy.max)

    this_algorithms.eaSimple(pop, this_toolbox, cxpb=0.5, mutpb=0.2, ngen=40,
                             stats=this_stats, halloffame=hof)

    # shut the pool down and wait for the workers to exit
    pool.close()
    pool.join()

if __name__ == "__main__":
    for tgt_weights in ((1.0,), (-1.0,)):
        do_onemax_mp(tgt_weights)
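If you wanted to adapt this driver loop to the original question's three-weight sweep, the same pattern works with itertools.product (a sketch; it assumes an evaluation function that returns one objective value per weight, which the single-objective onemax example does not):

import itertools

if __name__ == "__main__":
    for tgt_weights in itertools.product((1.0, -1.0), repeat=3):
        do_onemax_mp(tgt_weights)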
