Two bits of two __m128i to four bits of one __m128i - SSE
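I have two __m128i variables (a and b), and I am only interested in bit 63 and bit 127 of each.
In the end I want one __m128i variable (c) that holds those four bits of a and b at bit positions 31, 63, 95, and 127.
In pseudocode: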
c.bit31 = a.bit63
c.bit63 = a.bit127
c.bit95 = b.bit63
c.bit127 = b.bit127
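For checking results, the mapping above can be written as a scalar reference in C. This is only a sketch: the function name pack_sign_bits_ref is hypothetical, the vectors are viewed as four 32-bit lanes in memory order (so bit 63 is the top bit of lane 1 and bit 127 the top bit of lane 3), and it assumes every other bit of c should be zero, which the pseudocode does not state.

```c
#include <stdint.h>

/* Scalar reference for the bit mapping (hypothetical helper, not a fast path).
   Assumption: all bits of c other than 31, 63, 95 and 127 are cleared. */
static void pack_sign_bits_ref(const uint32_t a[4], const uint32_t b[4],
                               uint32_t c[4])
{
    const uint32_t top = 0x80000000u;  /* bit 31 of a 32-bit lane */
    c[0] = a[1] & top;  /* c.bit31  = a.bit63  */
    c[1] = a[3] & top;  /* c.bit63  = a.bit127 */
    c[2] = b[1] & top;  /* c.bit95  = b.bit63  */
    c[3] = b[3] & top;  /* c.bit127 = b.bit127 */
}
```

A vectorized version can be compared lane by lane against this function.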
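If I use a store (to a float array), convert to an int array[4], and finally a load (from the int array), I waste a lot of time.
I don't know how to do this with intrinsics (SSE, any version up to 4.2).
You can do this using only SSE2, like this: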
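//lanes are listed lowest first: a = [a0,a1,a2,a3]
__m128i t1 = _mm_shuffle_epi32(a,0xd0); //0xd0 = 3100 in base 4: t1 = [a0,a0,a1,a3]
__m128i t2 = _mm_shuffle_epi32(b,0xd0); //0xd0 = 3100 in base 4: t2 = [b0,b0,b1,b3]
__m128i t3 = _mm_unpackhi_epi32(t1,t2); //t3 = [a1,b1,a3,b3]
__m128i t4 = _mm_shuffle_epi32(t3,0xd8); //0xd8 = 3120 in base 4: t4 = [a1,a3,b1,b3]
__m128i t5 = _mm_and_si128(t4,_mm_set1_epi32(0x80000000)); //keep only bit 31 of each lane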
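To see where each 32-bit lane ends up, the sketch below runs the same shuffle, unpack, shuffle sequence on plain arrays and leaves out the final mask, so whole lane values stay visible. The helper name gather_odd_lanes and the array-based interface are assumptions made for illustration; the intrinsics are the ones used above.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: moves lanes 1 and 3 of a and b into one vector,
   giving [a1, a3, b1, b3] (lowest lane first). No masking is applied. */
static void gather_odd_lanes(const uint32_t a[4], const uint32_t b[4],
                             uint32_t out[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i t1 = _mm_shuffle_epi32(va, 0xd0);  /* [a0, a0, a1, a3] */
    __m128i t2 = _mm_shuffle_epi32(vb, 0xd0);  /* [b0, b0, b1, b3] */
    __m128i t3 = _mm_unpackhi_epi32(t1, t2);   /* [a1, b1, a3, b3] */
    __m128i t4 = _mm_shuffle_epi32(t3, 0xd8);  /* [a1, a3, b1, b3] */
    _mm_storeu_si128((__m128i *)out, t4);
}
```

Feeding it recognizable values such as a = {0xA0, 0xA1, 0xA2, 0xA3} and b = {0xB0, 0xB1, 0xB2, 0xB3} makes the lane movement easy to follow: the result is {0xA1, 0xA3, 0xB1, 0xB3}.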
Here is a working example:
#include <x86intrin.h>
#include <stdio.h>
int main(void) {
    __m128i a = _mm_setr_epi32(1,-2,3,-4);
    __m128i b = _mm_setr_epi32(5,-6,7,-8);
    __m128i t1 = _mm_shuffle_epi32(a,0xd0); //0xd0 = 3100 in base 4
    __m128i t2 = _mm_shuffle_epi32(b,0xd0); //0xd0 = 3100 in base 4
    __m128i t3 = _mm_unpackhi_epi32(t1,t2);
    __m128i t4 = _mm_shuffle_epi32(t3,0xd8); //0xd8 = 3120 in base 4
    __m128i t5 = _mm_and_si128(t4,_mm_set1_epi32(0x80000000)); //keep only bit 31 of each lane
    unsigned x[4];
    _mm_storeu_si128((__m128i*)x,t5); //unaligned store: x is not guaranteed to be 16-byte aligned
    for(int i=0; i<4; i++) printf("%x ", x[i]);
    printf("\n"); //prints 80000000 80000000 80000000 80000000
    return 0;
}
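As a possible alternative, not taken from the answer above, the same lane selection can be sketched with a single _mm_shuffle_ps, which takes its two low result lanes from the first operand and its two high lanes from the second. The function name pack_sign_bits_shufps is hypothetical. The casts between __m128i and __m128 only reinterpret bits, but on some CPUs mixing integer and floating-point instructions may add a small bypass delay, so it is worth measuring before preferring this form.

```c
#include <emmintrin.h>  /* SSE2: integer ops and the __m128i/__m128 casts */
#include <stdint.h>

/* Hypothetical helper: returns a vector whose bits 31, 63, 95 and 127 are
   a.bit63, a.bit127, b.bit63 and b.bit127; all other bits are zero. */
static __m128i pack_sign_bits_shufps(__m128i a, __m128i b)
{
    /* Select lanes 1 and 3 of a, then lanes 1 and 3 of b: [a1, a3, b1, b3]. */
    __m128 picked = _mm_shuffle_ps(_mm_castsi128_ps(a), _mm_castsi128_ps(b),
                                   _MM_SHUFFLE(3, 1, 3, 1));
    /* INT32_MIN has only bit 31 set, so the AND keeps one bit per lane. */
    return _mm_and_si128(_mm_castps_si128(picked), _mm_set1_epi32(INT32_MIN));
}
```

Calling it with the a and b from the example above should give the same four lanes as t5.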