Python multiprocessing--global variables in separate processes sharing id?

From this question I learned that:

When you use multiprocessing to open a second process, an entirely new instance of Python, with its own global state, is created. That global state is not shared, so changes made by child processes to global variables will be invisible to the parent process.

To verify this behavior, I made a test script:

import time
import multiprocessing as mp
from multiprocessing import Pool
x = [0]  # global
def worker(c):
    if c == 1:  # wait for proc 2 to finish; is global x overwritten by now?
        time.sleep(2)
    print('enter: x =', x, 'with id', id(x), 'in proc', mp.current_process())
    x[0] = c
    print('exit: x =', x, 'with id', id(x), 'in proc', mp.current_process())
    return x[0]

pool = Pool(processes=2)
x_vals = pool.map(worker, [1, 2])
print('parent: x =', x, 'with id', id(x), 'in proc', mp.current_process())
print('final output', x_vals)

The output (on CPython) is something like:

enter: x = [0] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-2, started daemon)>
exit: x = [2] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-2, started daemon)>
enter: x = [0] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-1, started daemon)>
exit: x = [1] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-1, started daemon)>
parent: x = [0] with id 140138406834504 in proc <_MainProcess(MainProcess, started)>
final output [1, 2]

How should I explain the fact that the id of x is shared in all the processes, yet x takes different values? Isn't id conceptually the memory address of a Python object? I guess this is possible if the memory space gets cloned in the child processes. Then is there something I can use to get the actual physical memory address of a Python object?

Shared State

When you use multiprocessing to open a second process, an entirely new instance of Python, with its own global state, is created. That global state is not shared, so changes made by child processes to global variables will be invisible to the parent process.

The crucial point here seems to be:

"That global state is not shared..."

...referring to the global state of the child process. But that doesn't mean that part of the parent's global state can't be shared with the child process, as long as the child process doesn't attempt to write to that part. When it does write, that part gets copied and changed, and the change will not be visible to the parent.

Background:

On Unix, 'fork' is the default way of starting the child process:

The parent process uses os.fork() to fork the Python interpreter. The child process, when it begins, is effectively identical to the parent process. All resources of the parent are inherited by the child process. Note that safely forking a multithreaded process is problematic.

Available on Unix only. The default on Unix.

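Fork is implemented using copy-on-write, so the child's memory pages initially map to the same physical memory as the parent's. Nothing is copied until one of the processes writes to a page; at that point the kernel gives the writer a private copy of that page. An in-place write such as x[0] = c therefore changes only the child's copy of the list, and the parent still sees [0].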
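A minimal sketch of this behavior, assuming a Unix platform where the 'fork' start method is available. The names child and run_demo are made up for the illustration. The start method is requested explicitly through get_context("fork") because the default differs between platforms and Python versions.

```python
import multiprocessing as mp
import os

x = [0]  # module-level global, inherited by the forked child


def child(conn):
    """Runs in the child: mutate the inherited list and report what it saw."""
    before = list(x)
    x[0] = 99  # in-place write; it lands in the child's private copy of the page
    conn.send({"pid": os.getpid(), "before": before, "after": list(x), "id": id(x)})
    conn.close()


def run_demo():
    """Fork one child, collect its report, and return it with the parent's view."""
    ctx = mp.get_context("fork")  # assumption: fork is available (Unix only)
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=child, args=(child_conn,))
    proc.start()
    report = parent_conn.recv()
    proc.join()
    return report, list(x), id(x)


if __name__ == "__main__":
    report, parent_x, parent_id = run_demo()
    print("child :", report)
    print("parent:", parent_x, "with id", parent_id)
```

The child reports the same id as the parent, because it inherited the same virtual address space, but only the child's copy of the list holds the new value.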


Memory address

How should I explain the fact that the id of x is shared in all the processes, yet x takes different values?

Fork creates a child process in which the virtual address space is identical to the virtual address space of the parent. The virtual addresses will all map to the same physical addresses until copy-on-write occurs.

Modern OSes use virtual addressing. Basically the address values (pointers) you see inside your program are not actual physical memory locations, but pointers to an index table (virtual addresses) that in turn contains pointers to the actual physical memory locations. Because of this indirection, you can have the same virtual address point to different physical addresses IF the virtual addresses belong to index tables of separate processes. (link)
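The indirection described above can be modeled with a toy simulation. This is only a sketch: PhysicalMemory and ToyProcess are hypothetical classes invented for the illustration, and a real kernel works on fixed-size pages through hardware page tables, not Python dicts.

```python
class PhysicalMemory:
    """Toy physical memory: a list of frames plus a share count per frame."""

    def __init__(self):
        self.frames = []   # frame number -> stored value
        self.sharers = []  # frame number -> how many page tables map this frame

    def allocate(self, value):
        self.frames.append(value)
        self.sharers.append(1)
        return len(self.frames) - 1


class ToyProcess:
    """Toy process: a page table mapping virtual addresses to frame numbers."""

    def __init__(self, memory, page_table=None):
        self.memory = memory
        self.page_table = dict(page_table or {})

    def map_page(self, vaddr, value):
        self.page_table[vaddr] = self.memory.allocate(value)

    def fork(self):
        # The child gets its own page table, but every entry points at the
        # same physical frame as the parent's entry.
        for frame in self.page_table.values():
            self.memory.sharers[frame] += 1
        return ToyProcess(self.memory, self.page_table)

    def read(self, vaddr):
        return self.memory.frames[self.page_table[vaddr]]

    def write(self, vaddr, value):
        frame = self.page_table[vaddr]
        if self.memory.sharers[frame] > 1:
            # Copy-on-write: remap this virtual address to a fresh frame.
            self.memory.sharers[frame] -= 1
            self.page_table[vaddr] = self.memory.allocate(value)
        else:
            self.memory.frames[frame] = value


memory = PhysicalMemory()
parent = ToyProcess(memory)
parent.map_page(0x7F00, [0])  # 0x7F00 is an arbitrary made-up virtual address
child = parent.fork()
shared_before_write = parent.page_table[0x7F00] == child.page_table[0x7F00]
child.write(0x7F00, [2])
shared_after_write = parent.page_table[0x7F00] == child.page_table[0x7F00]
print(shared_before_write, shared_after_write, parent.read(0x7F00), child.read(0x7F00))
```

Both toy processes keep using the virtual address 0x7F00 throughout; only the frame behind it changes for the child once it writes.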


Then is there something I can use to get the actual physical memory address of a Python object?

There doesn't seem to be a way to get the actual physical memory address (link). In CPython, id returns the virtual (logical) memory address. The actual translation from virtual to physical memory addresses falls to the MMU.
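As a rough check that id is an address inside the process's own virtual address space, the sketch below casts the integer back to an object with ctypes. This relies on a CPython implementation detail and is not something to use in real code; object_at is a hypothetical helper name.

```python
import ctypes
import sys


def object_at(address):
    """Return the Python object living at a virtual address (CPython only)."""
    return ctypes.cast(address, ctypes.py_object).value


x = [0]
if sys.implementation.name == "cpython":
    recovered = object_at(id(x))
    # The address is only meaningful inside this process's address space.
    print(hex(id(x)), recovered is x)
```

The same integer passed to object_at in a different, unrelated process would refer to whatever that process happens to have mapped at that virtual address, if anything.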
