Shared data utilizing multiple processors in Python
I have a CPU-intensive program which uses a large dictionary as data (around 250 MB). I have a multicore processor and want to utilize it so that I can run more than one task at a time.

The dictionary is mostly read-only and may be updated once a day.

How can I write this in Python without duplicating the dictionary?

I understand that Python threads don't offer true concurrency. Can I use the multiprocessing module without the data being serialized between processes?

I come from the Java world, and my requirement would be something like Java threads, which can share data, run on multiple processors, and offer synchronization primitives.
You can share read-only data among processes simply with a fork (on Unix; no easy way on Windows), but that won't catch the "once a day change" (you'd need to put an explicit way in place for each process to update its own copy). Native Python structures like `dict` are just not designed to live at arbitrary addresses in shared memory (you'd have to code a `dict` variant supporting that in C), so they offer no solace.
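A minimal sketch of the fork approach, assuming a Unix system: with the `"fork"` start method, child processes inherit the parent's dictionary via copy-on-write, so nothing is pickled or duplicated up front. The dictionary contents and `worker` function here are hypothetical placeholders.

```python
import multiprocessing as mp

# Hypothetical large read-only dict, built once in the parent process.
BIG_DICT = {i: i * i for i in range(100_000)}

def worker(key):
    # Under the "fork" start method (Unix only), the child inherits
    # BIG_DICT from the parent via copy-on-write; only `key` and the
    # return value are serialized between processes.
    return BIG_DICT[key]

if __name__ == "__main__":
    ctx = mp.get_context("fork")  # raises ValueError on Windows
    with ctx.Pool(processes=4) as pool:
        results = pool.map(worker, [1, 2, 3])
    print(results)  # [1, 4, 9]
```

Note the caveat from above: because each child gets its own copy-on-write snapshot, an update in the parent after the fork is not visible to already-running children.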
You could use Jython (or IronPython) to get a Python implementation with exactly the same multi-threading abilities as Java (or, respectively, C#), including multiple-processor usage by multiple simultaneous threads.
Have a look at multiprocessing in the stdlib: http://docs.python.org/library/multiprocessing.html has many great features that make it very easy to share data structures between processes.
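One of those features is `multiprocessing.Manager`, whose proxied dict would also handle the once-a-day update: all reads and writes go through a single manager process, so every worker sees changes (at the cost of IPC on each access, which is much slower than a local dict). A small sketch with hypothetical worker names:

```python
from multiprocessing import Manager, Process

def worker(shared, i):
    # Writes are forwarded to the manager process, so the parent and
    # any sibling process see them without explicit message passing.
    shared[f"worker-{i}"] = i * 10

def run_demo(n=3):
    with Manager() as mgr:
        shared = mgr.dict()  # proxy to a dict living in the manager
        procs = [Process(target=worker, args=(shared, i)) for i in range(n)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return dict(shared)  # plain-dict snapshot before the manager exits

if __name__ == "__main__":
    print(run_demo())
```

Whether the per-access IPC overhead is acceptable for a 250 MB, read-mostly dictionary depends on how hot the lookups are; for read-heavy workloads the fork approach above is usually far faster.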