Performance of joblib Parallel loops with numpy ndarrays

Question

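I am using numpy in Python for some statistical computations. My current implementation is not parallelized yet, so I am looking into joblib's Parallel for simple loop parallelization.

The non-parallel part of my code looks like this: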

def calcRADMatInt( i, j , RADMat, pdfMu, pdfSigma):
  if i==j:
    RADMat[i, j] = 0.0
  else:
    RADMat[i, j] = calcRAD( pdfMu[i], np.squeeze( pdfSigma[i]), pdfMu[j], np.squeeze( pdfSigma[j]) )
    RADMat[j, i] = RADMat[i,j]

def caldRADMat(....):

....
....
  RADMat = np.zeros( (numLandmark, numLandmark) )

  for i in range( 0, numLandmark):
    for j in range( i, numLandmark):
      calcRADMatInt( i, j, RADMat, pdfMu, pdfSigma)

....
....

I tried to parallelize it like this:

def caldRADMat(....):
....
....

  RADMat = np.zeros( (numLandmark, numLandmark) )

  for i in range( 0, numLandmark):
    Parallel(n_jobs=8)(delayed(calcRADMatInt)( i, j, RADMat, pdfMu, pdfSigma) for j in range( i, numLandmark))

....
....

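However, the resulting parallel code runs significantly slower than the non-parallel version.

So I guess my actual question is: am I using joblib Parallel correctly? Is this the right way to compute the elements of a numpy ndarray in parallel?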

Answer 1

You can put both for loops inside a single Parallel call, like this:

import numpy as np
from joblib import Parallel, delayed

def calcRADMatInt( i, j , RADMat, pdfMu, pdfSigma):
  if i==j:
    RADMat[i, j] = 0.0
  else:
    RADMat[i, j] = calcRAD( pdfMu[i], np.squeeze( pdfSigma[i]), pdfMu[j], np.squeeze( pdfSigma[j]) )
    RADMat[j, i] = RADMat[i,j]

def caldRADMat(....):
....
....

  RADMat = np.zeros( (numLandmark, numLandmark) )

  # One Parallel call covering every (i, j) pair of the upper triangle
  Parallel(n_jobs=-1)(
    delayed(calcRADMatInt)(i, j, RADMat, pdfMu, pdfSigma)
    for i in range(0, numLandmark)
    for j in range(i, numLandmark))

....
....
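One caveat worth checking with this approach: with joblib's default process-based backend, the arguments are copied (or memory-mapped) into the worker processes, so assignments to RADMat made inside a worker are generally not visible in the parent process. A minimal sketch of one way around this is to return each value and fill the matrix in the parent. Here `calc_rad` is a hypothetical stand-in for `calcRAD` (which the question does not show), and `rad_entry` and `calc_rad_mat` are illustrative names, not part of the original code.

```python
import numpy as np
from joblib import Parallel, delayed


def calc_rad(mu_i, sigma_i, mu_j, sigma_j):
    # Hypothetical stand-in for the question's calcRAD, which is not shown.
    return float(abs(mu_i - mu_j) + abs(sigma_i - sigma_j))


def rad_entry(i, j, pdf_mu, pdf_sigma):
    # Return the value instead of writing into a matrix owned by the parent.
    if i == j:
        return i, j, 0.0
    value = calc_rad(pdf_mu[i], np.squeeze(pdf_sigma[i]),
                     pdf_mu[j], np.squeeze(pdf_sigma[j]))
    return i, j, value


def calc_rad_mat(pdf_mu, pdf_sigma, n_jobs=2):
    n = len(pdf_mu)
    # One Parallel call over the whole upper triangle, diagonal included.
    results = Parallel(n_jobs=n_jobs)(
        delayed(rad_entry)(i, j, pdf_mu, pdf_sigma)
        for i in range(n)
        for j in range(i, n)
    )
    # Fill the symmetric matrix in the parent process.
    rad_mat = np.zeros((n, n))
    for i, j, value in results:
        rad_mat[i, j] = value
        rad_mat[j, i] = value
    return rad_mat
```

Whether this beats the serial loop depends on how expensive each `calcRAD` call is. If a single call is very cheap, the cost of shipping arguments and results between processes can still dominate.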

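If you call Parallel inside the loop, as you did, the computation is suboptimal: every iteration of the outer loop issues a separate Parallel call that dispatches only one row's worth of small tasks, so the dispatch overhead (and, depending on the backend, the cost of setting up and tearing down the worker pool) is paid numLandmark times instead of once.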
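If the outer loop really has to stay (for example because intermediate results are processed between iterations), joblib's Parallel object can be used as a context manager so that the same pool of workers is reused across calls. The following is a small sketch of the two patterns side by side. `row_sums` is a hypothetical placeholder for the real per-task work, and the function names are illustrative only.

```python
import numpy as np
from joblib import Parallel, delayed


def row_sums(row):
    # Hypothetical cheap task standing in for the real per-element work.
    return float(np.sum(row))


def sums_parallel_per_iteration(blocks, n_jobs=2):
    # Pattern from the question: a fresh Parallel call on every outer step.
    return [Parallel(n_jobs=n_jobs)(delayed(row_sums)(r) for r in block)
            for block in blocks]


def sums_reused_pool(blocks, n_jobs=2):
    # The context manager keeps one pool of workers alive across the calls.
    with Parallel(n_jobs=n_jobs) as parallel:
        return [parallel(delayed(row_sums)(r) for r in block)
                for block in blocks]
```

Both functions return the same values; only the way the workers are managed differs. How much time the reused pool saves depends on the backend and the joblib version, so it is worth timing both on the real workload.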

Hope this helps!

Best regards,
