
Generate the BZHI instruction with gcc

I'm trying to make gcc generate the bzhi instruction, part of BMI2, without using intrinsics, in order to write portable code.

Given what bzhi computes (it zeroes the high bits of a value, starting at a given bit position), I expected this to be relatively easy. The following SO answer provides a code example, simplified below:

unsigned bzhi32(unsigned value, int nbBits)
{
    return value & ((1u << nbBits) - 1);
}

clang has no problem generating the bzhi instruction from it, but I haven't managed to get the same result from gcc so far: https://godbolt.org/g/jYrh8F

I was wondering whether this is possible. This capability was at least requested, but I'm not sure whether it was ever implemented. If it was, maybe there are just some subtle issues in the code snippet, for example with types or properties, that could be fixed to make gcc perform this transformation.

Edit: added the u suffix to the constant, as suggested by @chux. It marginally changes the output for gcc, though it's still a 4-instruction function without bzhi.

This optimization is not implemented in gcc as of January 2018 (there is a feature request). You can get the instruction by using intrinsics:

#include <x86intrin.h>

unsigned bzhi32(unsigned value, int nbBits) {
   return _bzhi_u32(value, nbBits);
}
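
If portability is the goal, one option is to use the intrinsic only when BMI2 is enabled at compile time and fall back to plain C otherwise. The following is a minimal sketch, not a definitive implementation: the name bzhi32_portable is hypothetical, it relies on gcc and clang defining the __BMI2__ macro when BMI2 is enabled (for example with -mbmi2), and it assumes nbBits stays in the range 0 to 32.

```c
#include <stdint.h>

#if defined(__BMI2__)
#include <x86intrin.h>
#endif

/* Hypothetical helper (not from the original post): keep the low nbBits
   bits of value and clear the rest. Assumes 0 <= nbBits <= 32. */
static inline uint32_t bzhi32_portable(uint32_t value, unsigned nbBits)
{
#if defined(__BMI2__)
    /* BMI2 is enabled for this translation unit: use the intrinsic. */
    return _bzhi_u32(value, nbBits);
#else
    /* Plain C fallback. Shifting a 32-bit value by 32 is undefined in C,
       so handle that case explicitly; bzhi leaves the value unchanged
       when the index is at least the operand size. */
    if (nbBits >= 32)
        return value;
    return value & (((uint32_t)1 << nbBits) - 1);
#endif
}
```

For example, bzhi32_portable(0x12345678, 16) yields 0x5678 on either path. The explicit check for 32 also marks a difference from the original expression, which has undefined behavior when nbBits equals the width of the type.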
