简体   繁体   English

如何仅使用位移和加法进行乘法和除法?

[英]How can I multiply and divide using only bit shifting and adding?

如何仅使用位移和加法进行乘法和除法?

To multiply in terms of adding and shifting you want to decompose one of the numbers by powers of two, like so:要在加法和移位方面相乘,您需要将数字之一分解为 2 的幂,如下所示:

21 * 5 = 10101_2 * 101_2             (Initial step)
       = 10101_2 * (1 * 2^2  +  0 * 2^1  +  1 * 2^0)
       = 10101_2 * 2^2 + 10101_2 * 2^0 
       = 10101_2 << 2 + 10101_2 << 0 (Decomposed)
       = 10101_2 * 4 + 10101_2 * 1
       = 10101_2 * 5
       = 21 * 5                      (Same as initial expression)

( _2 means base 2) _2表示基数 2)

As you can see, multiplication can be decomposed into adding and shifting and back again.如您所见,乘法可以分解为加法和移位,然后再返回。 This is also why multiplication takes longer than bit shifts or adding - it's O(n^2) rather than O(n) in the number of bits.这也是为什么乘法比位移或加法花费更长的时间 - 它的位数为 O(n^2) 而不是 O(n)。 Real computer systems (as opposed to theoretical computer systems) have a finite number of bits, so multiplication takes a constant multiple of time compared to addition and shifting.真实的计算机系统(与理论计算机系统相反)的位数是有限的,因此与加法和移位相比,乘法需要一个常数倍的时间。 If I recall correctly, modern processors, if pipelined properly, can do multiplication just about as fast as addition, by messing with the utilization of the ALUs (arithmetic units) in the processor.如果我没记错的话,现代处理器,如果流水线正确,可以通过扰乱处理器中 ALU(算术单元)的利用率来进行乘法运算。

The answer by Andrew Toulouse can be extended to division. Andrew Toulouse 的答案可以扩展到除法。

The division by integer constants is considered in details in the book "Hacker's Delight" by Henry S. Warren (ISBN 9780201914658). Henry S. Warren (ISBN 9780201914658) 所著的“Hacker's Delight”一书中详细讨论了除以整数常量。

The first idea for implementing division is to write the inverse value of the denominator in base two.实现除法的第一个想法是将分母的倒数写为以二为底的值。

Eg, 1/3 = (base-2) 0.0101 0101 0101 0101 0101 0101 0101 0101 .....例如, 1/3 = (base-2) 0.0101 0101 0101 0101 0101 0101 0101 0101 .....

So, a/3 = (a >> 2) + (a >> 4) + (a >> 6) + ... + (a >> 30) for 32-bit arithmetics.因此,对于 32 位算术, a/3 = (a >> 2) + (a >> 4) + (a >> 6) + ... + (a >> 30)

By combining the terms in an obvious manner we can reduce the number of operations:通过以一种明显的方式组合这些术语,我们可以减少操作的数量:

b = (a >> 2) + (a >> 4)

b += (b >> 4)

b += (b >> 8)

b += (b >> 16)

There are more exciting ways to calculate division and remainders.还有更多令人兴奋的方法来计算除法和余数。

EDIT1:编辑1:

If the OP means multiplication and division of arbitrary numbers, not the division by a constant number, then this thread might be of use: https://stackoverflow.com/a/12699549/1182653如果 OP 表示任意数字的乘法和除法,而不是除以常数,那么这个线程可能有用: https ://stackoverflow.com/a/12699549/1182653

EDIT2:编辑2:

One of the fastest ways to divide by integer constants is to exploit the modular arithmetics and Montgomery reduction: What's the fastest way to divide an integer by 3?除以整数常量的最快方法之一是利用模运算和蒙哥马利约简: 将整数除以 3 的最快方法是什么?

X * 2 = 1 bit shift left X * 2 = 1 位左移
X / 2 = 1 bit shift right X / 2 = 1 位右移
X * 3 = shift left 1 bit and then add X X * 3 = 左移 1 位,然后加上 X

x << k == x multiplied by 2 to the power of k
x >> k == x divided by 2 to the power of k

You can use these shifts to do any multiplication operation.您可以使用这些移位来执行任何乘法运算。 For example:例如:

x * 14 == x * 16 - x * 2 == (x << 4) - (x << 1)
x * 12 == x * 8 + x * 4 == (x << 3) + (x << 2)

To divide a number by a non-power of two, I'm not aware of any easy way, unless you want to implement some low-level logic, use other binary operations and use some form of iteration.要将一个数字除以非 2 的幂,我不知道有什么简单的方法,除非您想实现一些低级逻辑,使用其他二进制操作并使用某种形式的迭代。

  1. A left shift by 1 position is analogous to multiplying by 2. A right shift is analogous to dividing by 2.左移 1 个位置类似于乘以 2。右移类似于除以 2。
  2. You can add in a loop to multiply.您可以添加一个循环来相乘。 By picking the loop variable and the addition variable correctly, you can bound performance.通过正确选择循环变量和加法变量,您可以约束性能。 Once you've explored that, you should use Peasant Multiplication一旦你探索了这一点,你应该使用农民乘法

A procedure for dividing integers that uses shifts and adds can be derived in straightforward fashion from decimal longhand division as taught in elementary school.使用移位和加法的整数除法程序可以直接从小学教的十进制手写除法中推导出来。 The selection of each quotient digit is simplified, as the digit is either 0 and 1: if the current remainder is greater than or equal to the divisor, the least significant bit of the partial quotient is 1.每个商位的选择被简化,因为该位是 0 和 1:如果当前余数大于或等于除数,则部分商的最低有效位为 1。

Just as with decimal longhand division, the digits of the dividend are considered from most significant to least significant, one digit at a time.就像十进制的普通除法一样,被除数的位数从最高位到最低位,一次一位。 This is easily accomplished by a left shift in binary division.这很容易通过二进制除法的左移来实现。 Also, quotient bits are gathered by left shifting the current quotient bits by one position, then appending the new quotient bit.此外,通过将当前商位左移一个位置,然后附加新的商位来收集商位。

In a classical arrangement, these two left shifts are combined into left shifting of one register pair.在经典排列中,这两个左移组合成一个寄存器对的左移。 The upper half holds the current remainder, the lower half initial holds the dividend.上半部分持有当前余额,下半部分初始持有股息。 As the dividend bits are transferred to the remainder register by left shift, the unused least significant bits of the lower half are used to accumulate the quotient bits.当被除数位通过左移传送到余数寄存器时,下半部分未使用的最低有效位用于累加商位。

Below is x86 assembly language and C implementations of this algorithm.下面是该算法的 x86 汇编语言和 C 实现。 This particular variant of a shift & add division is sometimes referred to as the "non-performing" variant, as the subtraction of the divisor from the current remainder is not performed unless the remainder is greater than or equal to the divisor (Otto Spaniol, "Computer Arithmetic: Logic and Design." Chichester: Wiley 1981, p. 144).这种特殊的移位和加法变体有时被称为“非执行”变体,因为除非余数大于或等于除数,否则不会执行从当前余数中减去除数的操作(Otto Spaniol, “计算机算术:逻辑与设计。”奇切斯特:威利 1981 年,第 144 页)。 In C, there is no notion of the carry flag used by the assembly version in the register pair left shift.在 C 语言中,在寄存器对左移中没有汇编版本使用的进位标志的概念。 Instead, it is emulated, based on the observation that the result of an addition modulo 2 n can be smaller that either addend only if there was a carry out.相反,它是模拟的,基于这样的观察,即加法模 2 n的结果可以小于仅当存在进位时任一加法的结果。

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define USE_ASM 0

#if USE_ASM
uint32_t bitwise_division (uint32_t dividend, uint32_t divisor)
{
    uint32_t quot;
    __asm {
        mov  eax, [dividend];// quot = dividend
        mov  ecx, [divisor]; // divisor
        mov  edx, 32;        // bits_left
        mov  ebx, 0;         // rem
    $div_loop:
        add  eax, eax;       // (rem:quot) << 1
        adc  ebx, ebx;       //  ...
        cmp  ebx, ecx;       // rem >= divisor ?
        jb  $quot_bit_is_0;  // if (rem < divisor)
    $quot_bit_is_1:          // 
        sub  ebx, ecx;       // rem = rem - divisor
        add  eax, 1;         // quot++
    $quot_bit_is_0:
        dec  edx;            // bits_left--
        jnz  $div_loop;      // while (bits_left)
        mov  [quot], eax;    // quot
    }            
    return quot;
}
#else
uint32_t bitwise_division (uint32_t dividend, uint32_t divisor)
{
    uint32_t quot, rem, t;
    int bits_left = CHAR_BIT * sizeof (uint32_t);

    quot = dividend;
    rem = 0;
    do {
            // (rem:quot) << 1
            t = quot;
            quot = quot + quot;
            rem = rem + rem + (quot < t);

            if (rem >= divisor) {
                rem = rem - divisor;
                quot = quot + 1;
            }
            bits_left--;
    } while (bits_left);
    return quot;
}
#endif

I translated the Python code to C. The example given had a minor flaw.我将 Python 代码翻译成 C。给出的示例有一个小缺陷。 If the dividend value that took up all the 32 bits, the shift would fail.如果被除数占用了所有 32 位,则移位将失败。 I just used 64-bit variables internally to work around the problem:我只是在内部使用 64 位变量来解决这个问题:

int No_divide(int nDivisor, int nDividend, int *nRemainder)
{
    int nQuotient = 0;
    int nPos = -1;
    unsigned long long ullDivisor = nDivisor;
    unsigned long long ullDividend = nDividend;

    while (ullDivisor <  ullDividend)
    {
        ullDivisor <<= 1;
        nPos ++;
    }

    ullDivisor >>= 1;

    while (nPos > -1)
    {
        if (ullDividend >= ullDivisor)
        {
            nQuotient += (1 << nPos);
            ullDividend -= ullDivisor;
        }

        ullDivisor >>= 1;
        nPos -= 1;
    }

    *nRemainder = (int) ullDividend;

    return nQuotient;
}

Take two numbers, lets say 9 and 10, write them as binary - 1001 and 1010.取两个数字,比如说 9 和 10,将它们写成二进制 - 1001 和 1010。

Start with a result, R, of 0.从结果 R 为 0 开始。

Take one of the numbers, 1010 in this case, we'll call it A, and shift it right by one bit, if you shift out a one, add the first number, we'll call it B, to R.取其中一个数字,在这种情况下为 1010,我们将其称为 A,然后将其右移一位,如果移出一个,则将第一个数字(我们将其称为 B)添加到 R。

Now shift B left by one bit and repeat until all bits have been shifted out of A.现在将 B 左移一位并重复,直到所有位都移出 A。

It's easier to see what's going on if you see it written out, this is the example:如果你看到它被写出来,就更容易看到发生了什么,这就是例子:

      0
   0000      0
  10010      1
 000000      0
1001000      1
 ------
1011010

Taken from here .取自这里

This is only for division:这仅适用于除法:

int add(int a, int b) {
        int partialSum, carry;
        do {
            partialSum = a ^ b;
            carry = (a & b) << 1;
            a = partialSum;
            b = carry;
        } while (carry != 0);
        return partialSum;
}

int subtract(int a, int b) {
    return add(a, add(~b, 1));
}

int division(int dividend, int divisor) {
        boolean negative = false;
        if ((dividend & (1 << 31)) == (1 << 31)) { // Check for signed bit
            negative = !negative;
            dividend = add(~dividend, 1);  // Negation
        }
        if ((divisor & (1 << 31)) == (1 << 31)) {
            negative = !negative;
            divisor = add(~divisor, 1);  // Negation
        }
        int quotient = 0;
        long r;
        for (int i = 30; i >= 0; i = subtract(i, 1)) {
            r = (divisor << i);
           // Left shift divisor until it's smaller than dividend
            if (r < Integer.MAX_VALUE && r >= 0) { // Avoid cases where comparison between long and int doesn't make sense
                if (r <= dividend) { 
                    quotient |= (1 << i);    
                    dividend = subtract(dividend, (int) r);
                }
            }
        }
        if (negative) {
            quotient = add(~quotient, 1);
        }
        return quotient;
}

This should work for multiplication:这应该适用于乘法:

.data

.text
.globl  main

main:

# $4 * $5 = $2

    addi $4, $0, 0x9
    addi $5, $0, 0x6

    add  $2, $0, $0 # initialize product to zero

Loop:   
    beq  $5, $0, Exit # if multiplier is 0,terminate loop
    andi $3, $5, 1 # mask out the 0th bit in multiplier
    beq  $3, $0, Shift # if the bit is 0, skip add
    addu $2, $2, $4 # add (shifted) multiplicand to product

Shift: 
    sll $4, $4, 1 # shift up the multiplicand 1 bit
    srl $5, $5, 1 # shift down the multiplier 1 bit
    j Loop # go for next  

Exit: #


EXIT: 
li $v0,10
syscall

The below method is the implementation of binary divide considering both numbers are positive.下面的方法是考虑到两个数字都是正数的二进制除法的实现。 If subtraction is a concern we can implement that as well using binary operators.如果减法是一个问题,我们也可以使用二元运算符来实现它。

Code代码

-(int)binaryDivide:(int)numerator with:(int)denominator
{
    if (numerator == 0 || denominator == 1) {
        return numerator;
    }

    if (denominator == 0) {

        #ifdef DEBUG
            NSAssert(denominator==0, @"denominator should be greater then 0");
        #endif
        return INFINITY;
    }

    // if (numerator <0) {
    //     numerator = abs(numerator);
    // }

    int maxBitDenom = [self getMaxBit:denominator];
    int maxBitNumerator = [self getMaxBit:numerator];
    int msbNumber = [self getMSB:maxBitDenom ofNumber:numerator];

    int qoutient = 0;

    int subResult = 0;

    int remainingBits = maxBitNumerator-maxBitDenom;

    if (msbNumber >= denominator) {
        qoutient |=1;
        subResult = msbNumber - denominator;
    }
    else {
        subResult = msbNumber;
    }

    while (remainingBits > 0) {
        int msbBit = (numerator & (1 << (remainingBits-1)))>0?1:0;
        subResult = (subResult << 1) | msbBit;
        if(subResult >= denominator) {
            subResult = subResult - denominator;
            qoutient= (qoutient << 1) | 1;
        }
        else{
            qoutient = qoutient << 1;
        }
        remainingBits--;

    }
    return qoutient;
}

-(int)getMaxBit:(int)inputNumber
{
    int maxBit = 0;
    BOOL isMaxBitSet = NO;
    for (int i=0; i<sizeof(inputNumber)*8; i++) {
        if (inputNumber & (1<<i)) {
            maxBit = i;
            isMaxBitSet=YES;
        }
    }
    if (isMaxBitSet) {
        maxBit+=1;
    }
    return maxBit;
}


-(int)getMSB:(int)bits ofNumber:(int)number
{
    int numbeMaxBit = [self getMaxBit:number];
    return number >> (numbeMaxBit - bits);
}

For multiplication:对于乘法:

-(int)multiplyNumber:(int)num1 withNumber:(int)num2
{
    int mulResult = 0;
    int ithBit;

    BOOL isNegativeSign = (num1<0 && num2>0) || (num1>0 && num2<0);
    num1 = abs(num1);
    num2 = abs(num2);


    for (int i=0; i<sizeof(num2)*8; i++)
    {
        ithBit =  num2 & (1<<i);
        if (ithBit>0) {
            mulResult += (num1 << i);
        }

    }

    if (isNegativeSign) {
        mulResult =  ((~mulResult)+1);
    }

    return mulResult;
}

it is basically multiplying and dividing with the base power 2它基本上是乘以和除以基本幂 2

shift left = x * 2 ^ y左移 = x * 2 ^ y

shift right = x / 2 ^ y右移 = x / 2 ^ y

shl eax,2 = 2 * 2 ^ 2 = 8 shl eax,2 = 2 * 2 ^ 2 = 8

shr eax,3 = 2 / 2 ^ 3 = 1/4 shr eax,3 = 2 / 2 ^ 3 = 1/4

For anyone interested in a 16-bit x86 solution, there is a piece of code by JasonKnight here 1 (he also includes a signed multiply piece, which I haven't tested).对于任何对 16 位 x86 解决方案感兴趣的人,这里有一段JasonKnight的代码1 (他还包括一个有符号的乘法部分,我还没有测试过)。 However, that code has issues with large inputs, where the "add bx,bx" part would overflow.但是,该代码存在较大输入的问题,其中“add bx,bx”部分会溢出。

The fixed version:固定版本:

softwareMultiply:
;    INPUT  CX,BX
;   OUTPUT  DX:AX - 32 bits
; CLOBBERS  BX,CX,DI
    xor   ax,ax     ; cheap way to zero a reg
    mov   dx,ax     ; 1 clock faster than xor
    mov   di,cx
    or    di,bx     ; cheap way to test for zero on both regs
    jz    @done
    mov   di,ax     ; DI used for reg,reg adc
@loop:
    shr   cx,1      ; divide by two, bottom bit moved to carry flag
    jnc   @skipAddToResult
    add   ax,bx
    adc   dx,di     ; reg,reg is faster than reg,imm16
@skipAddToResult:
    add   bx,bx     ; faster than shift or mul
    adc   di,di
    or    cx,cx     ; fast zero check
    jnz   @loop
@done:
    ret

Or the same in GCC inline assembly:或在 GCC 内联汇编中相同:

asm("mov $0,%%ax\n\t"
    "mov $0,%%dx\n\t"
    "mov %%cx,%%di\n\t"
    "or %%bx,%%di\n\t"
    "jz done\n\t"
    "mov %%ax,%%di\n\t"
    "loop:\n\t"
    "shr $1,%%cx\n\t"
    "jnc skipAddToResult\n\t"
    "add %%bx,%%ax\n\t"
    "adc %%di,%%dx\n\t"
    "skipAddToResult:\n\t"
    "add %%bx,%%bx\n\t"
    "adc %%di,%%di\n\t"
    "or %%cx,%%cx\n\t"
    "jnz loop\n\t"
    "done:\n\t"
    : "=d" (dx), "=a" (ax)
    : "b" (bx), "c" (cx)
    : "ecx", "edi"
);

Try this.尝试这个。 https://gist.github.com/swguru/5219592 https://gist.github.com/swguru/5219592

import sys
# implement divide operation without using built-in divide operator
def divAndMod_slow(y,x, debug=0):
    r = 0
    while y >= x:
            r += 1
            y -= x
    return r,y 


# implement divide operation without using built-in divide operator
def divAndMod(y,x, debug=0):

    ## find the highest position of positive bit of the ratio
    pos = -1
    while y >= x:
            pos += 1
            x <<= 1
    x >>= 1
    if debug: print "y=%d, x=%d, pos=%d" % (y,x,pos)

    if pos == -1:
            return 0, y

    r = 0
    while pos >= 0:
            if y >= x:
                    r += (1 << pos)                        
                    y -= x                
            if debug: print "y=%d, x=%d, r=%d, pos=%d" % (y,x,r,pos)

            x >>= 1
            pos -= 1

    return r, y


if __name__ =="__main__":
    if len(sys.argv) == 3:
        y = int(sys.argv[1])
        x = int(sys.argv[2])
     else:
            y = 313271356
            x = 7

print "=== Slow Version ...."
res = divAndMod_slow( y, x)
print "%d = %d * %d + %d" % (y, x, res[0], res[1])

print "=== Fast Version ...."
res = divAndMod( y, x, debug=1)
print "%d = %d * %d + %d" % (y, x, res[0], res[1])

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM