Parallelizing using Cython
Is there a way to parallelize the code below? I looked into Cython's prange, but couldn't figure out how it works. Does prange parallelize the inner loops across different cores? How can I parallelize the code below?

cimport cython
import numpy as np

# draw_topic is defined elsewhere and is not shown in the question.

@cython.boundscheck(False)
def gs_iterate_once(double[:,:] doc_topic,
                    double[:,:] topic_word,
                    double[:] topic_distribution,
                    double[:] topic_probabilities,
                    unsigned int[:,:] doc_word_topic,
                    int num_topics):
    cdef unsigned int doc_id
    cdef unsigned int word_id
    cdef unsigned int topic_id
    cdef unsigned int new_topic
    for i in xrange(doc_word_topic.shape[0]):
        doc_id = doc_word_topic[i, 0]
        word_id = doc_word_topic[i, 1]
        topic_id = doc_word_topic[i, 2]
        # Remove the current assignment from the counts
        doc_topic[doc_id, topic_id] -= 1
        topic_word[topic_id, word_id] -= 1
        topic_distribution[topic_id] -= 1
        for j in xrange(num_topics):
            topic_probabilities[j] = (doc_topic[doc_id, j] * topic_word[j, word_id]) / topic_distribution[j]
        new_topic = draw_topic(np.asarray(topic_probabilities))
        # Add the new assignment back to the counts
        doc_topic[doc_id, new_topic] += 1
        topic_word[new_topic, word_id] += 1
        topic_distribution[new_topic] += 1
        # Set the new topic
        doc_word_topic[i, 2] = new_topic

prange uses OpenMP, which is shared-memory parallelism. On a single computer it creates threads that run on the available cores and have access to the same pool of memory.
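
As a rough illustration of the shared-memory model only, here is a minimal sketch using Python's standard threading module instead of OpenMP. The names shared_results and worker are made up for this example. Note that in standard CPython builds the GIL serializes these threads, so unlike the OpenMP threads started by prange this shows no speedup; it only shows several threads writing into one buffer they all see.

```python
import threading

# Hypothetical shared buffer: every thread sees the same list object.
shared_results = [0] * 4


def worker(slot):
    # Each thread writes only to its own slot, so no two threads
    # ever touch the same element.
    shared_results[slot] = slot * slot


threads = [threading.Thread(target=worker, args=(slot,)) for slot in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()  # wait until every thread has written its slot

print(shared_results)
```

Because each thread owns a distinct slot, the result does not depend on the order in which the threads happen to run.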
For the routine that you show, the first step is to understand which part can be parallelized. Typically, a loop whose data uses i as the first index and that operates only on element i (and not, say, i-1 or i+1) is parallelizable. This is not the case here, because each iteration updates count arrays that the other iterations also read, so you need to find a way to make the computation more independent.
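
To make that criterion concrete, here is a small plain-Python sketch (not Cython, and not taken from the question). The functions scale and prefix_sum are hypothetical examples. Each takes the order in which the indices are visited, which stands in for the unpredictable order in which threads would execute the iterations.

```python
def scale(values, factor, order):
    """Independent loop: iteration i reads and writes only element i."""
    out = [0.0] * len(values)
    for i in order:
        out[i] = values[i] * factor
    return out


def prefix_sum(values, order):
    """Dependent loop: iteration i reads out[i - 1], which another iteration writes."""
    out = [0.0] * len(values)
    for i in order:
        previous = out[i - 1] if i > 0 else 0.0
        out[i] = previous + values[i]
    return out


values = [1.0, 2.0, 3.0, 4.0]
forward = list(range(len(values)))
backward = forward[::-1]

# The independent loop gives the same answer in any order.
same_when_independent = scale(values, 2.0, forward) == scale(values, 2.0, backward)
# The dependent loop does not, so it cannot simply be split across threads.
same_when_dependent = prefix_sum(values, forward) == prefix_sum(values, backward)

print(same_when_independent, same_when_dependent)
```

A loop that passes this kind of reordering check is a candidate for prange; a loop that fails it needs restructuring first.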
Actually finding the specific parallel pattern is beyond the scope of an SO answer, but I'll mention a few tips:
Python calls can be made only when they are part of a with gil block.

@PierredeBuyl's answer gives a good outline of what prange does and how to use it.
Here are a few more specific comments relating to your code:
You can't parallelize the outer loop: doc_topic[doc_id, topic_id] -= 1 and the similar lines for the other variables and for += 1 modify variables that are shared between all iterations of the loop, so running those iterations in parallel will cause inconsistent results.
A similar problem exists with topic_probabilities[j] = ... if you're parallelizing the outer loop, since every thread would be writing into the same array.
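
The kind of inconsistency meant here can be sketched by hand-simulating two threads that each run a -= 1 on the same counter. This is plain Python with no real threads, and the function names are invented for the illustration. A -= 1 is a read, a subtraction and a write, and the trouble starts when both reads happen before either write.

```python
def decrement_one_after_another(count):
    # Thread A completes its read-modify-write before thread B starts.
    a_read = count
    count = a_read - 1
    b_read = count
    count = b_read - 1
    return count


def decrement_interleaved(count):
    # Both threads read the old value before either one writes back.
    a_read = count
    b_read = count
    count = a_read - 1
    count = b_read - 1  # overwrites thread A's update
    return count


safe = decrement_one_after_another(5)
racy = decrement_interleaved(5)
print(safe, racy)
```

In the interleaved case one of the two decrements is lost, which is what can happen to doc_topic, topic_word and topic_distribution if two outer-loop iterations touch the same entry at the same time.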
You could easily parallelize the inner loop for j in xrange(num_topics): it only modifies data that depends on the index j, so the threads do not fight to modify the same data. (However, there's a performance cost each time you start a multithreaded region, so you usually try to parallelize the outer loop instead to avoid this. Depending on the size of the arrays, you may not gain much.)
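
As a sketch of why the inner loop is safe, here it is pulled out into a plain-Python function. The function name, the order parameter and the sample numbers are assumptions made for this illustration, not part of the original code. In Cython the corresponding change would be to iterate with prange from cython.parallel with the GIL released, which is not compiled or measured here.

```python
def topic_probabilities_for_token(doc_topic_row, topic_word_column,
                                  topic_distribution, order):
    """Compute the unnormalized topic weights for one token.

    Iteration j reads and writes only index j, so the visiting order
    (standing in for thread scheduling) cannot change the result.
    """
    probabilities = [0.0] * len(topic_distribution)
    for j in order:
        probabilities[j] = (doc_topic_row[j] * topic_word_column[j]
                            / topic_distribution[j])
    return probabilities


# Made-up counts for a model with three topics.
doc_topic_row = [2.0, 1.0, 4.0]        # doc_topic[doc_id, :]
topic_word_column = [3.0, 5.0, 1.0]    # topic_word[:, word_id]
topic_distribution = [6.0, 10.0, 8.0]

in_order = topic_probabilities_for_token(
    doc_topic_row, topic_word_column, topic_distribution, [0, 1, 2])
reordered = topic_probabilities_for_token(
    doc_topic_row, topic_word_column, topic_distribution, [2, 0, 1])

print(in_order, reordered)
```

With only a handful of topics per token, the cost of starting the threads may outweigh the work done in each loop, which is the caveat raised above.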