NEON vs Intel SSE - equivalence of certain operations

Question

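I'm having some trouble figuring out the NEON equivalents of a couple of Intel SSE operations. It seems that NEON cannot operate on an entire Q register at once (as a single 128-bit value). I haven't found anything in the arm_neon.h header or in the NEON intrinsics reference.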

What I want to do is the following:

// Intel SSE
// shift the entire 128-bit value right by 2 bytes; this is done
// without sign extension, by shifting in zeros
__m128i val = _mm_srli_si128(vector_of_8_s16, 2);
// insert the least significant 16 bits of "some_16_bit_val"
// (the whole value in this case) into the selected 16-bit
// element of vector "val" (the element with index 7 in this case)
val = _mm_insert_epi16(val, some_16_bit_val, 7);

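I've looked at the shift operations NEON provides, but couldn't find an equivalent way of doing the above (I don't have much experience with NEON). Is it possible to do the above (I assume it is and I just don't know how)? Any pointers are greatly appreciated.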

Answer 1

You want the VEXT instruction. Your example would look like:

int16x8_t val = vextq_s16(vector_of_8_s16, another_vector_s16, 1);

After this, bits 0-111 of val will contain bits 16-127 of vector_of_8_s16, and bits 112-127 of val will contain bits 0-15 of another_vector_s16.
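
To get the effect of the original SSE pair (shift right by one 16-bit element, then put some_16_bit_val in element 7), one option is to fill the second operand with the value to be inserted, since only its lane 0 ends up in the result. The following is a minimal sketch under that assumption, not a definitive implementation. The helper name shift_in_s16 and the array-based interface are hypothetical and chosen only for illustration, and the scalar branch is there only so the same lane semantics can be checked on a non-ARM build.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (name is not from the original post):
 * drops element 0 of in[], moves elements 1..7 down by one lane,
 * and places `last` in lane 7. */
static void shift_in_s16(const int16_t in[8], int16_t last, int16_t out[8])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int16x8_t v    = vld1q_s16(in);           /* load 8 x int16 */
    int16x8_t fill = vdupq_n_s16(last);       /* every lane holds `last` */
    int16x8_t val  = vextq_s16(v, fill, 1);   /* lanes 1..7 of v, then lane 0 of fill */
    vst1q_s16(out, val);                      /* store 8 x int16 */
#else
    /* Scalar fallback with the same lane semantics, for non-ARM builds. */
    for (int i = 0; i < 7; ++i)
        out[i] = in[i + 1];
    out[7] = last;
#endif
}
```

If the vector is already in a register, the same idea is just vextq_s16(vector_of_8_s16, vdupq_n_s16(some_16_bit_val), 1). Another option would be to extract against a zero vector and then set lane 7 with vsetq_lane_s16, which mirrors the two SSE steps more literally.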

Solution 1
6, accepted, 2011-08-27 14:53:31
