OpenCL: matrix multiplication overwrites the results of certain work-items

Question

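While implementing matrix multiplication in OpenCL I tried to write my own approach. However, the results of some work-items seem to be overwritten by other work-items, and I really don't know how to deal with it.

The one thing I am sure of is that the problem is inside the OpenCL program.

My host code is C/C++.

The program builds and produces output (wrong output, but the program exits successfully).

Here is my approach: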

__kernel void matrixMultiplication(
         __global double* matrix1,
         __global double* matrix2,
         __global double* output,
         const unsigned int ROWS_M1, // ROWS_M1 = 3
         const unsigned int COLS_M1, // COLS_M1 = 2
         const unsigned int ROWS_M2, // ROWS_M2 = 2
         const unsigned int COLS_M2, // COLS_M2 = 4
         const unsigned int ROWS_M3, // ROWS_M3 = 3
         const unsigned int COLS_M3) { // COLS_M3 = 4

    int i = get_global_id(0);
    int j = get_global_id(1);

    // for each value in matrix1 (one per work-item)
    // and for each value in the "jth" row of the second matrix...
    // multiply the values and then add them at the right offset.

    for(int k = 0; k < COLS_M2; k++){
        int offsetM1 = (i*COLS_M1)+j;
        int offsetM2 = (j*COLS_M2)+k;
        int offsetM3 = (i*COLS_M3)+k;

        //output[i][k] += matrix1[i][j]*matrix2[j][k];
        output[offsetM3] += matrix1[offsetM1]*matrix2[offsetM2];
    }

}

The value passed for each "const unsigned int" is given in the comments in the code.

The matrix values are:

Matrix 1:

1 2
3 4
5 6

Matrix 2:

2 3 4 5
6 7 8 9

Actual output:

12 14 16 18
24 28 32 36
36 42 48 54

Expected output:

14 17 20 23
30 37 44 51
46 57 68 79

Answer 1

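I think you are getting the indexing wrong. offsetM3 should be i*COLS_M3+j, offsetM1 should be i*COLS_M1+k, and offsetM2 should be k*COLS_M2+j. With that mapping, k runs over the shared dimension (COLS_M1, which equals ROWS_M2), not over COLS_M2.

Write the matrices out on paper and do the math, then write them out as the flat arrays they are in memory and multiply again; you will see the index pattern. Remember that each thread (work-item) is one element of the new matrix. If you change the index into the new matrix inside the for loop, you are no longer following the "one work-item per matrix element" logic. If that is really what you want, you need to think of a different scheme. Hope this helps.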
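To make the index pattern concrete, here is a minimal host-side sketch in plain C, not OpenCL. It assumes row-major storage, and the function name matmul_reference and its parameter names are made up for illustration. The two outer loops stand in for the 2D global range, so each (i, j) iteration plays the role of one work-item that owns exactly one output element.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side reference (hypothetical helper, not part of the original post).
   The (i, j) loops stand in for the 2D NDRange: one iteration per work-item,
   and each iteration writes exactly one element of the output. */
void matmul_reference(const double *a, const double *b, double *out,
                      size_t rows_a, size_t cols_a, size_t cols_b)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < rows_a; i++) {          /* like get_global_id(0) */
        for (size_t j = 0; j < cols_b; j++) {      /* like get_global_id(1) */
            double sum = 0.0;
            /* k walks the shared dimension: columns of a == rows of b */
            for (size_t k = 0; k < cols_a; k++) {
                sum += a[i * cols_a + k] * b[k * cols_b + j];
            }
            out[i * cols_b + j] = sum;             /* single write per (i, j) */
        }
    }
}
```

Called with the 3x2 and 2x4 matrices from the question (rows_a = 3, cols_a = 2, cols_b = 4), this fills the output with the expected values 14 17 20 23 / 30 37 44 51 / 46 57 68 79, which makes it a handy check against whatever the kernel returns.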

Answer 2

TL;DR

The problem was my loop. Don't do it that way; it's bad.

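Now that I have finished my university work and have some time, I can write a proper answer to my own question so that anyone else who stumbles on the same problem can find it.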

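The way I wrote the loop, several work-items ended up writing to the same output elements, which gave different results from one test run to the next. Essentially it is a mutual-exclusion problem (a data race): every work-item (i, j) with the same i does a read-modify-write on output[i][k], and the updates can be lost. It is the kind of problem you would normally solve with something like a semaphore.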
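The wrong output in the question can be reproduced by hand. The sketch below is plain C on the host, and it models only ONE possible interleaving, which is an assumption: every work-item sharing an output element reads the old value before any of them writes back, and the writes land in order of j. The real ordering on a device is not guaranteed. The name simulate_lost_update is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simulates one possible interleaving of the question's kernel (assumption:
   all work-items that share output[i][k] read it before any of them writes).
   Each write is then "old + one product", so only the last write survives. */
void simulate_lost_update(const double *a, const double *b, double *out,
                          size_t rows_a, size_t cols_a, size_t cols_b)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < rows_a; i++) {
        for (size_t k = 0; k < cols_b; k++) {
            /* stale value seen by every work-item (i, 0..cols_a-1) */
            double old = out[i * cols_b + k];
            for (size_t j = 0; j < cols_a; j++) {
                /* each work-item overwrites the previous one's result */
                out[i * cols_b + k] = old + a[i * cols_a + j] * b[j * cols_b + k];
            }
        }
    }
}
```

With the question's matrices and a zeroed output buffer, only the j = 1 products remain, giving 12 14 16 18 / 24 28 32 36 / 36 42 48 54, the same values as the actual output reported in the question.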

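The solution was to rewrite the whole loop, using a different way of computing each offset.

Here is the source that solved my problem, for anyone who finds it interesting or useful: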

#pragma OPENCL EXTENSION cl_khr_fp64 : enable
__kernel void multiplyMatrix(
   __global const double* matrix1,
   __global const double* matrix2,
   __global double* output,
   const unsigned int ROWS_M1,
   const unsigned int COLS_M1,
   const unsigned int ROWS_M2,
   const unsigned int COLS_M2,
   const unsigned int ROWS_M3,
   const unsigned int COLS_M3) {

   // One work-item per output element: global size is (ROWS_M3, COLS_M3).
   unsigned int i = get_global_id(0);   // row of the output
   unsigned int j = get_global_id(1);   // column of the output

   // Skip work-items that fall outside the output matrix.
   if (i >= ROWS_M3 || j >= COLS_M3) return;

   double aux = 0.0;

   // Walk the shared dimension (COLS_M1 == ROWS_M2) and accumulate
   // the dot product of row i of matrix1 and column j of matrix2.
   for (unsigned int k = 0; k < COLS_M1; k++) {
       unsigned int offsetM1 = (i*COLS_M1)+k;
       unsigned int offsetM2 = (k*COLS_M2)+j;

       // output[i][j] += matrix1[i][k]*matrix2[k][j]
       aux += matrix1[offsetM1]*matrix2[offsetM2];
   }

   // Single write; no other work-item touches this element.
   output[(i*COLS_M3)+j] = aux;
}

2 answers

Answer 1: 0, 2018-01-21 22:36:27
Answer 2: 0, accepted, 2018-10-01 18:25:20
