I'm experimenting with writing some AVX-optimised functions. I want to use a standard unsigned integer argument type like uint64_t instead of the 256-bit integer vector type the AVX intrinsics require (__m256i_u). Is it possible to do the cast?
No. Intel's intrinsics API doesn't allow actual C casts between integer and vector types; I think not even between uint64_t and __m64 (a 64-bit MMX vector).
Use the _mm256_set... intrinsics to get value(s) into a vector, with a broadcast or a list of operands, and _mm_cvtsi128_si64 (with _mm256_castsi256_si128 when necessary) to get the low value out of a vector. See Intel's intrinsics guide for the cvt and _mm256_set intrinsics; Google the intrinsic name for examples of using it, especially with site:stackoverflow.com.
You might want to limit your intrinsics guide searches to SSE4, not AVX2, to limit the number of intrinsics to wade through. The parameter lists are also shorter there: it's more immediately visible that _mm_set_epi32() takes 4 int args, for a total of 128 bits.
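As a rough sketch of moving scalars into and out of a vector, something like the following should work on x86-64 with GCC or Clang. The function name low_lane_after_add and the constants are made up for illustration. The target("avx2") attribute is a GCC/Clang extension used here only so the snippet builds without -mavx2; compiling the whole file with -mavx2 is the more usual approach.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: takes a plain uint64_t, does the work in a
   256-bit vector, and returns the low 64-bit element as a scalar. */
__attribute__((target("avx2")))   /* GCC/Clang extension; or build with -mavx2 */
uint64_t low_lane_after_add(uint64_t x)
{
    /* Broadcast the scalar into all four 64-bit elements. */
    __m256i vx = _mm256_set1_epi64x((long long)x);

    /* List of operands: arguments go from the highest element to the lowest,
       so 10 ends up in the low element. */
    __m256i k = _mm256_set_epi64x(40, 30, 20, 10);

    __m256i sum = _mm256_add_epi64(vx, k);     /* AVX2 integer add */

    /* The cast only reinterprets the low 128 bits; no conversion happens. */
    __m128i lo = _mm256_castsi256_si128(sum);

    /* Low 64-bit element back to a scalar (x86-64 only). */
    return (uint64_t)_mm_cvtsi128_si64(lo);
}
```

Callers only ever see uint64_t, so the vector types stay an implementation detail of the function.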
See also What are the names and meanings of the intrinsic vector element types, like epi64x or pi32? re: the existence of epi64x
vs. epi64
(MMX to XMM vs. 64-bit integer)
Also, use __m256i, not GCC's internal __m256i_u unaligned type. Use __m256i v = _mm256_loadu_si256((const __m256i*) ptr); to do an unaligned load.
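A minimal sketch of that unaligned-load pattern, assuming the data lives in an ordinary uint64_t array with no particular alignment. The function name add_one_to_each is hypothetical, and the target("avx2") attribute is a GCC/Clang extension standing in for building with -mavx2.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: adds 1 to each of four uint64_t values.
   src and dst need no special alignment. */
__attribute__((target("avx2")))   /* GCC/Clang extension; or build with -mavx2 */
void add_one_to_each(uint64_t *dst, const uint64_t *src)
{
    /* The pointer cast is the documented way to call the load intrinsic;
       loadu places no alignment requirement on the address. */
    __m256i v = _mm256_loadu_si256((const __m256i *)src);

    v = _mm256_add_epi64(v, _mm256_set1_epi64x(1));   /* AVX2 integer add */

    /* Matching unaligned store. */
    _mm256_storeu_si256((__m256i *)dst, v);
}
```

The variable itself is still declared as plain __m256i; only the load and store intrinsics need to know that memory might be unaligned.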