SSE: convert __m128 and __m128i into two __m128d

Two related questions.

My code needs to do the following with a fairly large amount of data. The conversions run inside inner loops, so performance is important.

  1. Convert an array of __int32 into doubles (or convert __m128i into two __m128d).
  2. Convert an array of floats into doubles (or convert __m128 into two __m128d).

Basically, I need functions with the following signatures:

void convert_int_to_double(__int32 const * input, double * output);
void convert_float_to_double(float const * input, double * output);

Input and output pointers are aligned and the number of elements is a multiple of 4. The main problem is how to quickly unpack __m128 into two __m128d.

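The intrinsics _mm_cvtepi32_pd and _mm_cvtps_pd convert the two low elements of their argument to doubles, so the two high elements have to be moved into the low half before a second conversion.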

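For the integer case, the loop should look like this:

__m128i* base_addr = ...;
for( int i = 0; i < cnt; ++i )
{
    // Load four aligned 32-bit integers.
    __m128i epi32 = _mm_load_si128( base_addr + i );
    // Convert the two low integers to doubles.
    __m128d v0 = _mm_cvtepi32_pd( epi32 );
    // Shift right by 8 bytes so the two high integers move into the low half.
    epi32 = _mm_srli_si128( epi32, 8 );
    // Convert what were the two high integers.
    __m128d v1 = _mm_cvtepi32_pd( epi32 );
    ....
}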
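As a rough sketch of how both requested functions might look when written out in full, the same idea can be applied to the float case by using _mm_movehl_ps to bring the two high floats into the low half. This is only one possible way to do it, not necessarily the fastest. The count parameter is an assumption added here for illustration (the signatures in the question do not have one), and int32_t stands in for __int32 so that the code is not tied to one compiler. As stated in the question, pointers are assumed to be 16-byte aligned and count a multiple of 4.

```c
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE ones */

/* Hypothetical signature: `count` is not part of the original question.
   Assumes 16-byte aligned pointers and count being a multiple of 4. */
void convert_int_to_double(int32_t const *input, double *output, size_t count)
{
    for (size_t i = 0; i < count; i += 4) {
        /* Load four 32-bit integers. */
        __m128i v = _mm_load_si128((__m128i const *)(input + i));
        /* Convert the two low integers. */
        __m128d lo = _mm_cvtepi32_pd(v);
        /* Shift the two high integers down by 8 bytes, then convert them. */
        __m128d hi = _mm_cvtepi32_pd(_mm_srli_si128(v, 8));
        _mm_store_pd(output + i, lo);
        _mm_store_pd(output + i + 2, hi);
    }
}

/* Same assumptions as above. */
void convert_float_to_double(float const *input, double *output, size_t count)
{
    for (size_t i = 0; i < count; i += 4) {
        /* Load four floats. */
        __m128 v = _mm_load_ps(input + i);
        /* Convert the two low floats. */
        __m128d lo = _mm_cvtps_pd(v);
        /* _mm_movehl_ps(v, v) copies the two high floats into the low half. */
        __m128d hi = _mm_cvtps_pd(_mm_movehl_ps(v, v));
        _mm_store_pd(output + i, lo);
        _mm_store_pd(output + i + 2, hi);
    }
}
```

Both conversions are exact, since every 32-bit integer and every float is representable as a double, so the output can be checked against a plain scalar cast.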

 