
Compare two __m128i values for total order

I need a way to compare values of type __m128i in C++ that gives a total order over all values of type __m128i. The kind of order doesn't matter as long as it is a total order over all __m128i values. So the comparison could be less-than between 128-bit integers, or something else entirely, as long as it provides a total order.

I tried using the < operator, but it doesn't return a bool. Instead it seems to compare the vector components of __m128i (i.e. SIMD):

#include <emmintrin.h>

inline bool isLessThan(__m128i a, __m128i b) noexcept {
     // error: cannot convert '__vector(2) long int' to 'bool' in return
     return a < b;
}

Another possibility would be to use memcmp / strcmp or similar, but this would very likely not be optimal. Targeting modern Intel x86-64 CPUs with at least SSE4.2 and AVX2, are there any intrinsics / instructions I could use for such comparisons? How should I do it?
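For reference, a minimal sketch of the memcmp fallback mentioned above. The names isLessThanMemcmp and demoMemcmpOrder are hypothetical. On little-endian x86 this orders values lexicographically starting from the lowest-addressed byte, so it is not numeric order, but it is still a total order.

```cpp
#include <emmintrin.h>  // SSE2, for __m128i
#include <cassert>
#include <cstring>

// Hypothetical baseline: order two vectors by their raw bytes.
inline bool isLessThanMemcmp( __m128i a, __m128i b ) noexcept
{
    // memcmp compares bytes as unsigned char, starting from the lowest address
    return std::memcmp( &a, &b, sizeof( __m128i ) ) < 0;
}

// Hypothetical demo: the byte order in memory decides, not the numeric value.
inline void demoMemcmpOrder()
{
    const __m128i one = _mm_cvtsi32_si128( 1 );       // bytes in memory: 01 00 00 ...
    const __m128i x100 = _mm_cvtsi32_si128( 0x100 );  // bytes in memory: 00 01 00 ...

    assert( isLessThanMemcmp( x100, one ) );   // first byte 00 < 01, although 0x100 > 1
    assert( !isLessThanMemcmp( one, x100 ) );
    assert( !isLessThanMemcmp( one, one ) );   // irreflexive
}
```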

PS: Similar questions have been asked about checking equality, but not about ordering.

Here you go.

#include <emmintrin.h>  // SSE2 intrinsics

inline bool isLessThan( __m128i a, __m128i b )
{
    /* Compare 8-bit lanes for ( a < b ), store the bits in the low 16 bits of the
       scalar value: */
    const int less = _mm_movemask_epi8( _mm_cmplt_epi8( a, b ) );

    /* Compare 8-bit lanes for ( a > b ), store the bits in the low 16 bits of the
       scalar value: */
    const int greater = _mm_movemask_epi8( _mm_cmpgt_epi8( a, b ) );

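    /* It's counter-intuitive, but this scalar comparison does the right thing.
       Bit i of each mask corresponds to byte i. At the most significant byte
       where a and b differ, exactly one of the two masks has its bit set, and
       that mask is the larger integer. If a == b, both masks are zero. */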
    return less > greater;
}

The order is less than ideal because pcmpgtb treats the bytes as signed integers, but you said that doesn't matter for your use case.
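To make the caveat concrete, here is a small sketch of where the signed-byte order departs from numeric order while still being total. The helpers isEqual, exactlyOneRelation and demoSignedByteOrder are hypothetical names added for illustration. Only isLessThan comes from the answer.

```cpp
#include <emmintrin.h>  // SSE2
#include <cassert>

// Same comparator as in the answer, repeated so this sketch compiles on its own.
inline bool isLessThan( __m128i a, __m128i b )
{
    const int less = _mm_movemask_epi8( _mm_cmplt_epi8( a, b ) );
    const int greater = _mm_movemask_epi8( _mm_cmpgt_epi8( a, b ) );
    return less > greater;
}

// Hypothetical helper: true if all 16 bytes match.
inline bool isEqual( __m128i a, __m128i b )
{
    return _mm_movemask_epi8( _mm_cmpeq_epi8( a, b ) ) == 0xFFFF;
}

// Hypothetical helper: in a total order, exactly one of
// a < b, b < a, a == b holds for any pair.
inline bool exactlyOneRelation( __m128i a, __m128i b )
{
    return int( isLessThan( a, b ) ) + int( isLessThan( b, a ) )
         + int( isEqual( a, b ) ) == 1;
}

// Hypothetical demo with three hand-picked values.
inline void demoSignedByteOrder()
{
    const __m128i zero = _mm_setzero_si128();
    const __m128i one = _mm_cvtsi32_si128( 1 );
    const __m128i x80 = _mm_cvtsi32_si128( 0x80 );

    assert( isLessThan( zero, one ) );  // agrees with numeric order
    assert( isLessThan( x80, zero ) );  // 0x80 is -128 as a signed byte, so it sorts before 0
    assert( exactlyOneRelation( zero, x80 ) );
    assert( exactlyOneRelation( one, one ) );
}
```

So the comparator should be fine as the ordering for a sorted container or std::sort, as long as nothing relies on the result matching integer order.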


Update: here's a slightly slower version that sorts in uint128_t order.

// True if a < b, for unsigned 128 bit integers
inline bool cmplt_u128( __m128i a, __m128i b )
{
    // Flip the sign bits in both arguments.
    // Transforms 0 into -128 = minimum for signed bytes,
    // 0xFF into +127 = maximum for signed bytes
    const __m128i signBits = _mm_set1_epi8( (char)0x80 );
    a = _mm_xor_si128( a, signBits );
    b = _mm_xor_si128( b, signBits );

    // Now the signed byte comparisons will give the correct order
    const int less = _mm_movemask_epi8( _mm_cmplt_epi8( a, b ) );
    const int greater = _mm_movemask_epi8( _mm_cmpgt_epi8( a, b ) );
    return less > greater;
}

We build unsigned compares out of signed ones by range-shifting the unsigned inputs into the signed range: flipping the high bit of each byte is the same as subtracting 128.
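As a sanity check, the unsigned version can be compared against a plain scalar reference that treats each value as a (high, low) pair of uint64_t. This is only a sketch: cmplt_u128_ref, make128 and demoUnsignedOrder are hypothetical names, and the sample values are an arbitrary pick around the byte and sign boundaries.

```cpp
#include <emmintrin.h>  // SSE2
#include <cassert>
#include <cstdint>

// Same function as in the answer, repeated so this sketch compiles on its own.
inline bool cmplt_u128( __m128i a, __m128i b )
{
    const __m128i signBits = _mm_set1_epi8( (char)0x80 );
    a = _mm_xor_si128( a, signBits );
    b = _mm_xor_si128( b, signBits );
    const int less = _mm_movemask_epi8( _mm_cmplt_epi8( a, b ) );
    const int greater = _mm_movemask_epi8( _mm_cmpgt_epi8( a, b ) );
    return less > greater;
}

// Hypothetical scalar reference: compare the high halves first, then the low halves.
inline bool cmplt_u128_ref( uint64_t aHi, uint64_t aLo, uint64_t bHi, uint64_t bLo )
{
    return aHi != bHi ? aHi < bHi : aLo < bLo;
}

// Hypothetical helper: build a vector from its high and low 64-bit halves.
inline __m128i make128( uint64_t hi, uint64_t lo )
{
    return _mm_set_epi64x( (long long)hi, (long long)lo );
}

// Hypothetical demo: check every combination of a few boundary values.
inline void demoUnsignedOrder()
{
    const uint64_t samples[] = { 0, 1, 0x7F, 0x80, 0xFF, 0x100,
                                 0x8000000000000000ull, UINT64_MAX };
    for( uint64_t aHi : samples )
        for( uint64_t aLo : samples )
            for( uint64_t bHi : samples )
                for( uint64_t bLo : samples )
                    assert( cmplt_u128( make128( aHi, aLo ), make128( bHi, bLo ) )
                            == cmplt_u128_ref( aHi, aLo, bHi, bLo ) );
}
```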
