
Python multiprocessing: sharing data between separate Python processes

Multiprocessing allows me to share data between processes started from within the same Python interpreter. But what if I need to share data between processes started by separate Python interpreters? I was looking at multiprocessing.Manager, which seems to be the right construct for this. If I create a manager, I can see its address:

>>> from multiprocessing import Manager
>>> m=Manager()
>>> m.address
'/tmp/pymp-o2TCd_/listener-Qld03B'

And the socket is there:

adrian@sammy ~/temp $ netstat -naA unix | grep pymp
unix  2      [ ACC ]     STREAM     LISTENING     1220401  /tmp/pymp-o2TCd_/listener-Qld03B

If I start a new process with multiprocessing.Process, it spawns a new Python interpreter that somehow inherits information about shared constructs such as this Manager. Is there a way to access the Manager from a new Python process that was NOT spawned from the one that created it?

You are on the (or a) right track with this.

In a comment, stovfl suggests looking at the remote manager section of the Python multiprocessing Manager documentation (Python2, Python3). As you have observed, each manager has a nameable entity (a socket in /tmp in this case) through which any Python process can connect to a peer Python process. Because these are accessible from any process, however, they each have an access key.

The default key for each Manager is the one for the "main process", and it is a string of 32 random bytes:

class _MainProcess(BaseProcess):

    def __init__(self):
        self._identity = ()
        self._name = 'MainProcess'
        self._parent_pid = None
        self._popen = None
        self._config = {'authkey': AuthenticationString(os.urandom(32)),
                        'semprefix': '/mp'}
        # Note that some versions of FreeBSD only allow named
        # semaphores to have names of up to 14 characters.  Therefore
        # we choose a short prefix.
        #
        # On MacOSX in a sandbox it may be necessary to use a
        # different prefix -- see #19478.
        #
        # Everything in self._config will be inherited by descendant
        # processes.

but you may assign your own key, which you then know and can therefore use from anywhere else.
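As a minimal sketch of that idea, the following uses a manager with an explicitly chosen address and authkey. The names `StateManager`, `StateClient`, `get_state`, `shared_state` and the key value are made up for illustration, and a localhost TCP address is assumed (a filesystem path for a Unix socket should work as well on Unix). In practice the server half and the client half would live in two separately started scripts; here the server runs in a background thread only so the example is self-contained.

```python
import threading
from multiprocessing.managers import BaseManager

AUTHKEY = b"example-secret"   # hypothetical key: choose your own and keep it private
shared_state = {}             # the object owned by the "server" side


class StateManager(BaseManager):
    """Server side: exposes shared_state under the registered name 'get_state'."""


StateManager.register("get_state", callable=lambda: shared_state)

# Port 0 asks the OS for a free port; a real deployment would use a fixed,
# agreed-upon address so that other processes know where to connect.
server = StateManager(address=("127.0.0.1", 0), authkey=AUTHKEY).get_server()
threading.Thread(target=server.serve_forever, daemon=True).start()


class StateClient(BaseManager):
    """Client side: only needs the registered name, the address and the key."""


StateClient.register("get_state")

client = StateClient(address=server.address, authkey=AUTHKEY)
client.connect()

state = client.get_state()        # proxy to the dict held by the server
state.update({"answer": 42})      # the call is forwarded to the server's dict
value = state.get("answer")
```

The proxy returned here exposes only the dict's public methods (`update`, `get`, `keys`, and so on), so the sketch calls `update` and `get` rather than using item syntax. A client that presents a different authkey should be rejected when it calls `connect()`.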

There are other ways to handle this. For instance, you can use XML RPC to export callable functions from one Python process, callable from anything (not just Python) that can speak XML RPC. See the Python2 or Python3 documentation. Heed this warning (this is the py3k variant but it applies in py2k as well):

Warning: The xmlrpc.client module is not secure against maliciously constructed data. If you need to parse untrusted or unauthenticated data see XML vulnerabilities.

Do not, however, assume that using a multiprocessing.Manager instead of XML RPC secures you against maliciously constructed data. Managers are just as vulnerable, since they will unpickle arbitrary data. See Attacking Python's pickle for more about this.
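For the XML RPC route mentioned above, a rough Python 3 sketch could look like this. The function `add` and the choice of a localhost address are assumptions for illustration; as with any such server, it should only be exposed to callers you trust. The server again runs in a background thread purely to keep the example in one file.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """Hypothetical function exported to other processes."""
    return a + b


# Port 0 lets the OS pick a free port; a real service would use a fixed one.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add, "add")
host, port = server.server_address
threading.Thread(target=server.serve_forever, daemon=True).start()

# Any XML RPC client, in any language, could make this call.
with ServerProxy(f"http://{host}:{port}/") as client:
    result = client.add(2, 3)

server.shutdown()
server.server_close()
```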
