
Returning large objects from child processes in python multiprocessing

I'm working with Python multiprocessing to spawn some workers. Each of them should return an array that's a few MB in size.

  1. Is it correct that since my return array is created in the child process, it needs to be copied back to the parent's memory when the process ends? (this seems to take a while, but it might be a pypy issue)
  2. Is there a mechanism to allow the parent and child to access the same in-memory object? (synchronization is not an issue since only one child would access each object)

I'm afraid I have a few gaps in how Python implements multiprocessing, and trying to persuade pypy to play nice is not making things any easier. Thanks!

Yes, if the return array is created in the child process, it must be sent to the parent by pickling it, sending the pickled bytes back to the parent via a Pipe, and then unpickling the object in the parent. For a large object, this is pretty slow in CPython, so it's not just a PyPy issue. It is possible that performance is worse in PyPy, though; I haven't tried comparing the two, but this PyPy bug seems to suggest that multiprocessing in PyPy is slower than in CPython.
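To make the cost concrete, here is a minimal sketch of that default path, assuming a Pool of workers that each return a list of a million floats (the worker function and sizes are illustrative, not from the question):

    import multiprocessing as mp

    def worker(n):
        # The child builds a few-MB list; returning it forces the whole
        # object to be pickled, piped back to the parent, and unpickled.
        return [0.0] * n

    if __name__ == '__main__':
        with mp.Pool(processes=4) as pool:
            # Each return value pays the full pickle/Pipe/unpickle cost.
            results = pool.map(worker, [1000000] * 4)
        print(len(results), len(results[0]))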

In CPython, there is a way to allocate ctypes objects in shared memory, via multiprocessing.sharedctypes. PyPy seems to support this API, too. The limitation (obviously) is that you're restricted to ctypes objects.
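A sketch of what that can look like, using RawArray since it skips the synchronization wrapper and the question says only one child touches each object (names here are illustrative):

    import multiprocessing as mp
    from multiprocessing.sharedctypes import RawArray

    def worker(shared, n):
        # Writes land directly in shared memory, so nothing needs to be
        # pickled back to the parent when the child finishes.
        for i in range(n):
            shared[i] = float(i)

    if __name__ == '__main__':
        n = 1000000
        arr = RawArray('d', n)  # 'd' = C double, allocated in shared memory
        p = mp.Process(target=worker, args=(arr, n))
        p.start()
        p.join()
        print(arr[0], arr[n - 1])  # parent sees the child's writes

If you ever do need locking around the object, multiprocessing.Array provides the same thing with an optional lock attached.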

There is also multiprocessing.Manager, which would allow you to create a shared array/list object in a Manager process, and then both the parent and child could access the shared list via a Proxy object. The downside there is that read/write performance on the object is much slower than for a local object, and even slower than for a roughly equivalent object created using multiprocessing.sharedctypes.
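A sketch of the Manager route (again with illustrative names). Note that every access through the proxy is a round-trip to the Manager process, so batched operations like extend() are much cheaper than element-by-element writes:

    import multiprocessing as mp

    def worker(shared_list, n):
        # extend() is a single round-trip to the Manager process;
        # appending in a loop would pay that cost n times.
        shared_list.extend(range(n))

    if __name__ == '__main__':
        with mp.Manager() as manager:
            shared = manager.list()
            p = mp.Process(target=worker, args=(shared, 1000))
            p.start()
            p.join()
            print(len(shared))  # 1000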
