
Multithreading Forwards/Backwards for a CRF

I am attempting to multithread the forward-backward algorithm to find marginal probabilities. This will be used as a submodule for training a CRF. Below is pseudocode for the forward pass of CRF training on a single example (sourced from here).

# alpha[0][j] is assumed to be initialised separately; all other entries start at log(0) = -inf
for i in 1 .. T-1:
    for j in 1 .. N:
        for k in 1 .. N:
            p = alpha[i-1][k] + trans[k][j] + obvs[j][i]
            alpha[i][j] = logadd(alpha[i][j], p)
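For reference, the pseudocode can be turned into a small runnable Python sketch. This is only an illustration under a few assumptions: all scores are in log space, indices are 0-based, and the `init` start scores and the `forward` function name are hypothetical (the post does not say how the first row of `alpha` is filled).

```python
import math

NEG_INF = float("-inf")


def logadd(a, b):
    """Return log(exp(a) + exp(b)) without leaving log space."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def forward(trans, obvs, init):
    """Fill alpha[i][j] for T positions and N labels (all scores are logs).

    trans[k][j]: score of moving from label k to label j
    obvs[j][i]:  score of label j at position i
    init[j]:     hypothetical start score for label j (not in the post)
    """
    n = len(trans)
    t = len(obvs[0])
    alpha = [[NEG_INF] * n for _ in range(t)]
    # The first row has no predecessor, so it is filled separately.
    for j in range(n):
        alpha[0][j] = init[j] + obvs[j][0]
    # Row i depends only on row i-1, which is what forces a barrier per row.
    for i in range(1, t):
        for j in range(n):
            for k in range(n):
                p = alpha[i - 1][k] + trans[k][j] + obvs[j][i]
                alpha[i][j] = logadd(alpha[i][j], p)
    return alpha
```

Each `alpha[i][j]` needs only N additions and N `logadd` calls, so with N at most 10 there is very little work per entry between one synchronization point and the next.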

My plan was to use N threads (reasonable for my use case, with N being at most 10) to compute the columns of the alpha matrix independently, syncing with a barrier once each row has been computed. This actually runs SLOWER than the original sequential code.

I believe that the overhead of the synchronization mechanisms is the root of this problem, even though I am using threads from a pool and reusable barriers for all operations. Is there a better design that I should consider, or is N simply too small to justify parallelizing the computation of the columns of alpha?

The reason for my slowdown was using Python's threading library instead of the multiprocessing library. Due to the nature of the GIL (Global Interpreter Lock), only one thread was actually running at a time. This is quite ineffective for a CPU-bound program that does not require any IO operations. Context switching only served to make my code run slower. Sharing synchronization primitives between different processes is another problem entirely, and I am not sure how to deal with that yet...
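One possible way to avoid sharing barriers between processes altogether is to parallelize across training examples rather than inside a single forward pass, since separate examples do not depend on each other. The sketch below is an assumption, not something from the original post: `log_partition`, `log_partition_many` and `run_in_processes` are hypothetical names, every label is given a start score of 0, and only the standard-library `concurrent.futures` and `functools` modules are used.

```python
import math
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def logsumexp(values):
    """Return log(sum(exp(v))) for a non-empty list of log-space scores."""
    hi = max(values)
    if hi == float("-inf"):
        return hi
    return hi + math.log(sum(math.exp(v - hi) for v in values))


def log_partition(trans, obvs):
    """Sequential forward pass for ONE example; returns log Z.

    Assumes a start score of 0 for every label (not stated in the post).
    """
    n, t = len(trans), len(obvs[0])
    alpha = [obvs[j][0] for j in range(n)]
    for i in range(1, t):
        alpha = [
            logsumexp([alpha[k] + trans[k][j] for k in range(n)]) + obvs[j][i]
            for j in range(n)
        ]
    return logsumexp(alpha)


def log_partition_many(trans, examples, executor):
    """Map the forward pass over examples; no barrier is needed between them."""
    return list(executor.map(partial(log_partition, trans), examples))


def run_in_processes(trans, examples, workers=4):
    """Hypothetical entry point: spread the examples over worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return log_partition_many(trans, examples, pool)
```

`run_in_processes` should be called from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Whether this pays off depends on how long each example is, because every task has to be pickled and sent to a worker.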
