
SSE intrinsics: masking a float and using bitwise and?

Basically, the problem comes from x86 assembler, where you have a number that you want to set either to zero or to the number itself using an AND. If you AND that number with negative one you get the number itself back, but if you AND it with zero you get zero.

Now the problem I'm having with SSE intrinsics is that floats aren't the same in binary as doubles (or maybe I'm mistaken). Anyway, here's the code. I've tried using all kinds of floats to mask the second and third numbers (127.0f and 99.0f respectively), but no luck.

#include <xmmintrin.h>
#include <stdio.h>

void print_4_bit_num(const char * label, __m128 var)
{
    float val[4];
    _mm_storeu_ps(val, var); /* copy the four lanes out instead of aliasing the vector */
    printf("%s: %f %f %f %f\n",
       label, val[3], val[2], val[1], val[0]);
}
int main()
{
    __m128 v1 = _mm_set_ps(1.0f, 127.0f,  99.0f, 1.0f);
    __m128 v2 = _mm_set_ps(1.0f, 65535.0f, 127.0f, 0.0f);
    __m128 v = _mm_and_ps(v1, v2);

    print_4_bit_num("v1", v1);
    print_4_bit_num("v2", v2);
    print_4_bit_num("v ", v);

    return 0;
}

You need to use a bitwise (integer) mask when you AND, so, for example, to clear alternate values in a vector you might do something like this:

__m128 v1 = _mm_set_ps(1.0f, 127.0f,  99.0f, 1.0f);
__m128 v2 = _mm_castsi128_ps(_mm_set_epi32(0, -1, 0, -1));
__m128 v = _mm_and_ps(v1, v2); // => v = { 0.0f, 127.0f, 0.0f, 1.0f }

You can cast any SSE vector to any other SSE vector type of the same size (128 bit or 256 bit), and you will get exactly the same bits as before; the cast generates no actual code. Obviously, if you cast 4 floats to 2 doubles you get nonsense, but in your case you cast the floats to some integer type, do the AND, and cast the result back.

If you have SSE4.1 (which I bet you do), you should consider _mm_blendv_ps(a,b,mask). It uses only the sign bit of each element of its mask argument and essentially implements the vectorised mask<0?b:a.
