Matrix Multiplication Using SSE: Error Converting 'float' to '__m128 *'?

I'm trying to implement matrix multiplication using SSE intrinsics. I'm not sure whether my code is correct, and I can't compile it either, because I get this error:

Error 1 error C2440: 'type cast' : cannot convert from 'float' to '__m128 *'

Can someone double-check that my matrix multiplication is correct? Note that this is for square matrices only.

Here is my code.

void Intrinsics (float * matrix_a, float * matrix_b, float * matrix_result, const int num_row, const int num_col) {
    __declspec(align(16)) float * a = matrix_a;
    __declspec(align(16)) float * b = matrix_b;
    __declspec(align(16)) float * c = matrix_result;

    for(int i = 0; i < num_row; ++i)
    {
        for(int j = 0; j < num_col; ++j)
        {
            __m128 *m3 = (__m128*)a[i];     // The error is here.
            __m128 *m4 = (__m128*)b[j];
            float* res;
            *(c + (j * num_col + i)) = 0;
            for(int k = 0; k < num_col; k += 4)
            {
                __m128 m5 = _mm_mul_ps(*m3,*m4);
                res = (float*)&m5;
                *(c + (j * num_col + i)) += res[0]+res[1]+res[2]+res[3];
                m3++;
                m4++;
            }
        }
    }
}

I assume the error comes from __m128 *m3 = (__m128*)a[i]; and the line after it. a[i] is a float value, not an address, so you are trying to cast a float to a pointer to __m128, and the compiler is right to complain.

I don't know the details of the intended algorithm. Assuming the intent is to access the four floats a[i]..a[i+3] as a single __m128, you need to take the address of the element before casting, something like this:

__m128 *m3 = (__m128*)&a[i];
__m128 *m4 = (__m128*)&b[j];

or, equivalently:

__m128 *m3 = (__m128*)(a + i);
__m128 *m4 = (__m128*)(b + j);
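
The casts above make the code compile, but two things are worth checking before relying on the result. First, dereferencing a __m128* requires the address to be 16-byte aligned, and __declspec(align(16)) on a pointer variable aligns the pointer itself, not the memory it points to. Second, &a[i] points at element i, not at row i; in a row-major n x n matrix, row i starts at a + i * n. Below is a minimal sketch of one way to sidestep both issues. It is not the asker's algorithm: it uses a different technique (broadcast one element of A and multiply it by four consecutive elements of a row of B) together with unaligned loads and stores. The name multiply_sse and the parameter n are hypothetical, and the sketch assumes row-major square matrices whose size is a multiple of 4.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE: __m128, _mm_loadu_ps, _mm_mul_ps, ...

// Hypothetical helper (not from the original post): computes C = A * B for
// square n x n matrices stored in row-major order.
// Assumption: n is a multiple of 4. No alignment is required, because
// _mm_loadu_ps / _mm_storeu_ps accept unaligned addresses.
void multiply_sse(const float* a, const float* b, float* c, int n) {
    assert(n % 4 == 0);
    for (int i = 0; i < n; ++i) {
        // Produce four adjacent outputs C[i][j..j+3] per iteration.
        for (int j = 0; j < n; j += 4) {
            __m128 acc = _mm_setzero_ps();
            for (int k = 0; k < n; ++k) {
                // Broadcast A[i][k] into all four lanes.
                __m128 a_ik = _mm_set1_ps(a[i * n + k]);
                // Load B[k][j..j+3], which is contiguous in row-major order.
                __m128 b_kj = _mm_loadu_ps(b + k * n + j);
                acc = _mm_add_ps(acc, _mm_mul_ps(a_ik, b_kj));
            }
            _mm_storeu_ps(c + i * n + j, acc);
        }
    }
}
```

Because each vector holds four outputs from the same row of C, there is no need to sum the lanes of a product vector or to transpose B. If the buffers are known to be 16-byte aligned (for example, allocated with _mm_malloc), the aligned variants _mm_load_ps and _mm_store_ps could be used instead.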
