简体   繁体   中英

Algorithm for copying N bits at arbitrary position from one int to another

An interesting problem I've been pondering the past few days is how to copy one integer's bits into another integer at a given position in the destination integer. So, for example, given the destination integer 0xdeadbeef and the source integer 0xabcd , the idea would be to get a result of 0xabcdbeef (given a destination position of 16 bits) or 0xdeabcdef (given a destination position of 8 bits).

With the arbitrary limitation of avoiding conditionals or loops (allowing myself to use just mathematical/bitwise operations), I developed the following function (C++)

int setbits(int destination, int source, int at, int numbits)
{
    int ones = ((1<<(numbits))-1)<<at;
    return (ones|destination)^((~source<<at)&ones);
}

where at is the place where the source bits should be copied into the destination number (0-31) and numbits is the number of bits being copied from source (1-32). As far as I can tell, this algorithm works for all values except for at = 0 and numbits = 32 (the case when the entire destination integer is being overwritten by the source integer) due to the fact that 1<<32 results in 1 (since the shift wraps around) as opposed to 0.

My questions are:

  1. How is this normally done? Are there any particularly notable algorithms used (by notable, I'm asking if there are any particularly efficient tricks that can be used to do this)?
  2. Does my algorithm work as well as I think it does (that is, works for all values except at = 0 and numbits = 32)?
  3. Related to 1), is there any way to do this only using mathematical/bitwise operators? The algorithm for all values is trivial using conditions or loops, so I'm not interested in that.

Algorithm design is usually a weak point for me, so I have no idea whether or not my algorithm is 'as good as it gets' when only using mathematical/bitwise operations. Thanks

I don't think it can be done more efficient unless you write assembler.

You can improve the readability and solve your overflow problem changing some little things:

int setbits2(int destination, int source, int at, int numbits)
{
    // int mask = ((1LL<<numbits)-1)<<at; // 1st aproach
    int mask = ((~0u)>>(sizeof(int)*8-numbits))<<at; // 2nd aproach
    return (destination&~mask)|((source<<at)&mask);
}

More efficient assembler version (VC++):

// 3rd aproach
#define INT_SIZE 32;
int setbits3(int destination, int source, int at, int numbits)
{ __asm {
    mov ecx, INT_SIZE
    sub ecx, numbits
    or  eax, -1
    shr eax, cl
    mov ecx, at
    shl eax, cl // mask == eax
    mov ebx, eax
    not eax
    and eax, destination
    mov edx, source
    shl edx, cl
    and edx, ebx
    or  eax, edx
}}
  • 1st aproach: Slower on 32bit architecture
  • 2nd aproach: (~0u) and (sizeof(int)*8) are calculated at compile time, so they don't charge any cost.
  • 3rd aproach: You save 3 ops (memory accesses) writing it in assembler but you will need to write ifdefs if you want to make it portable.

I don't think it's the case that 1<<32 wraps (otherwise, why doesn't 2<<31 also wrap?), instead I think that internally modulus 32 is applied to the second operator, so that 1<<32 is actually equivalent to 1<<0. Also, consider changing the parameters types from "int" to "unsigned int". To get the value of "ones" without running into the "1<<32" problem, you can do this:

unsigned int ones = (0xffffffff >> (32-numbits)) << at;

I don't believe there are any "standard" methods for this kind of operation. I'm sure there are other ways of using bitwise operators in different ways to achieve the same outcome, but your algorithm is as good as any.

Having said that, though, maintainability and documentation is also important. Your function would benefit from the algorithm being documented with a comment, especially to explain how you use the bitwise XOR -- which is clever, but not easy to understand at first glance.

It pretty good: I tried this alternate version, but yours was about 30% faster in testing:

    int[] bits = new int[] {0,1,3,7,15,31,63,127,255,511,1023
        ,2047,4095,8192,16383,32767,65535,131071,262143,524287
        ,1048575,2097151,4194303,8388607,16777215,33554431,67108863
        ,134217727,268435455,536870911,1073741823,2147483647,-1};

    public int setbits2(int destination, int source, int at, int numbits)
    {
        int ones = bits[numbits + at] & ~bits[at];
        return (destination & ~ones) | ((source << at) & ones);
    }

Generalized GRB-fnieto form...

template <typename T>
T setbits4(T destination, T source, int at, int numbits)
{
    T mask = (((T)-1)>>(sizeof(T)*8-numbits))<<at; // 4th aproach
    return (destination&~mask)|((source<<at)&mask);
}

I think it hardly could be more efficient. Moreover, bitwise operations are much faster than any algebraic operations.

// SET OF FUNCTIONS

//##########    BIT - BIT   
template < typename var_t >     inline  var_t       bit_V           ( uint8_t b )                                               { return var_t(1) << b; }           // Same as usual macros, but this one converts de variable type, so that you can use it in uint8_t to uint64_t for example.
template < typename var_t >     inline  var_t       bit_get         ( const var_t & V , uint8_t b )                             { return V &    bit_V<var_t>(b); }  // Can be used as bool or to get the mask of the bit.
template < typename var_t >     inline  var_t       bit_settled     ( const var_t & V , uint8_t b )                             { return V |    bit_V<var_t>(b); } 
template < typename var_t >     inline  var_t       bit_unsettled   ( const var_t & V , uint8_t b )                             { return V &~   bit_V<var_t>(b); } 
template < typename var_t >     inline  void        bit_set         ( var_t & V , uint8_t b )                                   {        V |=   bit_V<var_t>(b); } 
template < typename var_t >     inline  void        bit_unset       ( var_t & V , uint8_t b )                                   {        V &=  ~bit_V<var_t>(b); } 
template < typename var_t >     inline  void        bit_mod         ( var_t & V , uint8_t b , bool set )                        { if (set) bit_set(V,b); else bit_unset(V,b); } //  compiler will optimize depending on if 'set' is constant.
template < typename var_t >     inline  void        bit_cpy         ( var_t & V , const var_t & S , uint8_t b )                 { var_t t = bit_get(S,b); V |= t; V &~ t; } 
template < typename var_t >     inline  void        bit_cpy         ( var_t & V , const var_t & S , uint8_t bV , uint8_t bM )   { bit_mod(V,bV,bit_get(S,bM)); } 
/// MULTIPLE BITS:
template < typename var_t >     inline  void        bits_set        ( var_t & V , const var_t & S )                                     { V |=  S;  }
template < typename var_t >     inline  void        bits_unset      ( var_t & V , const var_t & S )                                     { V &= ~S; }
/// ONLY WITH UNSIGNED INTS:    'at' parameters are refered to the less significant bit (lsb), starting at 0 index ( a byte would have 7 to 0 bits ).
template < typename var_t >             void        bits_cpy        ( var_t & V , const var_t & S , uint8_t numBits , uint8_t atlsb = 0  )  {   //  I choosed not to make this one inline
                                                                        var_t       mask = (~var_t(0)>>(sizeof(var_t)*8 - numBits))<<atlsb;
                                                                        bits_unset  ( V , mask ) ; 
                                                                        bits_set    ( V , S & mask  ) ; 
                                                                    }
template < typename var_t >             void        bits_cpy        ( var_t & V , const var_t & S , uint8_t numBits , uint8_t atVlsb , uint8_t atSlsb ) {   //  I choosed not to make this one inline
                                                                        bits_cpy ( V , (atVlsb>atSlsb)?(S<<(atVlsb-atSlsb)):(S>>(atSlsb-atVlsb)) , numBits , atVlsb ) ; 
                                                                    }
template < typename var_t >             var_t       bits_cpyd       ( const var_t & V , const var_t & S , uint8_t numBits , uint8_t atlsb = 0  )    { 
                                                                        var_t r = V;
                                                                        bits_cpy    (r,S,numBits,atlsb); 
                                                                        return r;
                                                                    }
template < typename var_t >             var_t       bits_cpyd       ( const var_t & V , const var_t & S , uint8_t numBits , uint8_t atVlsb , uint8_t atSlsb )   {   
                                                                        var_t r = V;
                                                                        bits_cpy    (r,S,numBits,atVlsb,atSlsb); 
                                                                        return r;
                                                                    }

//##########    BIT - BIT   - EXAMPLE OF USE WITH THE MOST RELEVANT FUNCTIONS:
// I used them inside functions, to get/set two variables inside a class, u and c

                void                u_set               ( edrfu_t u )       {           bits_cpy    <uint32_t>  ( CFG       , u         , 8     , 2             ,0              );}
                edrfu_t             u_get               ()                  { return    bits_cpyd   <uint32_t>  ( 0         , CFG       , 8     , 0             ,2              );}
                void                c_set               ( edrfc_t c )       {           bits_cpy    <uint32_t>  ( CFG       , c         , 2     );}
                edrfc_t             c_get               ()                  { return    bits_cpyd   <uint32_t>  ( 0         , CFG       , 2     );}

uint32_t copy_bits(uint32_t dst, uint32_t src, uint8_t end_bit, uint8_t start_bit)

{

uint32_t left, right, mask, result;

if (end_bit <= start_bit)
{
    printf("%s: end_bit:%d shall be greater than start_bit: %d\n", __FUNCTION__, end_bit, start_bit);
    return 0;
}

left   = ~0; // All Fs
right  = ~0;
result = 0;
left  >>= ((sizeof(uint32_t)*8) - end_bit); // Create left half of mask
right <<= start_bit; // Create right half of mask
mask   =  (left & right); // Now you have the mask for specific bits
result = (dst & (~mask)) | (src & (mask));
printf("%s, dst: 0x%08x, src: 0x%08x, end_bit: %d, start_bit: %d, mask: 0x%08x, result: 0x%08x\n",
      __FUNCTION__, dst, src, end_bit, start_bit, mask, result);

return result;

}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM