
function parameter to xmm0

I'm trying to move an aligned float array into an XMM register:

#define ALIGNED16 __declspec(align(16))

ALIGNED16 float vector1[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
ALIGNED16 float vector2[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
ALIGNED16 float result[4];

_add_vector(vector1, vector2, result);
....

void _add_vector(float *__restrict v1, float * __restrict v2, float * __restrict rvec)
{
  __asm
  {
    movaps xmm0, xmmword ptr [v1]
    movaps xmm1, xmmword ptr [v2]

    addps xmm0, xmm1

    movaps xmmword ptr [rvec], xmm0
  };
}

When the code tries to load from v1 into xmm0, I get a "read access violation": v1 was 0xFFFFFFFF.

But if I do

__asm
  {
    movaps xmm0, xmmword ptr [v1]
  };

right after the vector1 declaration, then it works. Why?

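The issue is that v1, v2, and rvec are pointers to arrays of floats. In inline assembly, [v1] refers to the memory where the pointer variable itself is stored (the parameter's stack slot), not to the array it points to, so movaps tries to read 16 bytes from that location instead of from the array. You need to load each pointer into a register first and then dereference the register to reach the actual array. Something like this may work: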

void _add_vector(float *__restrict v1, float * __restrict v2, float * __restrict rvec);

void _add_vector(float *__restrict v1, float * __restrict v2, float * __restrict rvec)
{
    __asm
    {
        mov ecx, [v1]       ; ecx = address stored in v1
        mov edx, [v2]       ; edx = address stored in v2
        mov eax, [rvec]     ; eax = address stored in rvec

        movaps xmm0, xmmword ptr [ecx]
        movaps xmm1, xmmword ptr [edx]

        addps xmm0, xmm1

        movaps xmmword ptr [eax], xmm0
    };
}

In this case I use the caller-saved registers EAX, ECX, and EDX to hold the pointers, and then dereference those registers to reach the arrays.
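For comparison, the same load-through-the-pointer idea can be sketched with SSE intrinsics instead of inline assembly. This is only an illustration, not part of the original answer: the function name add_vector_sse is hypothetical, and the sketch assumes an x86 target with SSE and a compiler that provides <xmmintrin.h>. Here _mm_load_ps takes the pointer value and reads the 16 bytes it points to, so the extra step of moving the pointer into a register is handled by the compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE intrinsics: _mm_load_ps, _mm_add_ps, _mm_store_ps */

/* Hypothetical helper name, not from the original post. */
static void add_vector_sse(const float *v1, const float *v2, float *rvec)
{
    /* Aligned loads/stores (movaps-style) require 16-byte-aligned addresses. */
    assert(((uintptr_t)v1 % 16) == 0);
    assert(((uintptr_t)v2 % 16) == 0);
    assert(((uintptr_t)rvec % 16) == 0);

    __m128 a = _mm_load_ps(v1);   /* reads 4 floats from the address held in v1 */
    __m128 b = _mm_load_ps(v2);   /* reads 4 floats from the address held in v2 */

    /* Element-wise add, then an aligned store to the address held in rvec. */
    _mm_store_ps(rvec, _mm_add_ps(a, b));
}
```

Called with the three 16-byte-aligned arrays from the question, as in add_vector_sse(vector1, vector2, result), this should leave the element-wise sums in result.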
