
Correct O(n log n) time complexity for multiplying two integers in C on modern x86 CPUs?

Consider the following C code on modern-day Intel or AMD x86_64 hardware, where the datatype int has 32 bits:

// calc x times y for any two integers values of x and y (where the result can be stored by the int datatype)
int fastMult(int x, int y) {
    /*
     * Assuming the following operators take O(n) or O(1) time:
     * ==, <, >, &&, ||, &, |, >>, -, +, ?:
     */

    // x*0 == 0*y == 0
    if (y == 0 || x == 0) return 0;

    // (-x)*(-y) == x*y and (-x)*y == x*(-y)
    if (x < 0) return fastMult(-x, -y);

    int isNegative = y < 0; // x cannot be negative here

    // y*x is faster than x*y when |y| is bigger than |x|, so swap the arguments
    if ((isNegative && x < -y) || x < y) return fastMult(y, x);
    if (isNegative) y = -y; // handle y in a simpler way

    int res = fastMult(x, y >> 1); // called at most lb(y) times, i.e. at most the bit width of y (32) times
    res = res + res; // one addition
    if (y & 1) res = x + res; // possible second addition

    // if y was negative, then the answer is negative
    return isNegative ? -res : res;
}

If we disregard the recursive step, the slowest operation in this code, besides the branches caused by the ifs, would be the + operation. That operation can still be executed within a single clock cycle by the CPU. So even though adding two 32 bit integers takes essentially O(1) time on our hardware, we should still count it as O(n) for bigint operations, where a bigint is an integer with n bits.
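
To make the bigint accounting concrete, here is a minimal sketch (my own illustration, not part of the original question) of schoolbook addition for big integers stored as arrays of n 32-bit limbs; the single pass over all n limbs is the reason addition is counted as O(n) in this model. The name bigint_add and the limb layout are assumptions made for the sake of the example.

#include <stdint.h>

// Add two big integers given as arrays of n 32-bit limbs, least significant limb first.
// One pass over all n limbs: O(n) digit operations.
void bigint_add(const uint32_t *a, const uint32_t *b, uint32_t *sum, int n) {
    uint64_t carry = 0;
    for (int i = 0; i < n; i++) {
        uint64_t t = (uint64_t)a[i] + b[i] + carry; // one limb addition
        sum[i] = (uint32_t)t;                       // keep the low 32 bits
        carry = t >> 32;                            // carry into the next limb
    }
    // a carry out of the top limb is dropped here to keep the sketch short
}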

Having said this, the base case for this recursive implementation of multiplying two numbers is y == 0. The function calls itself (apart from the first few calls, where it might swap x and y and change some signs) at most 32 times, since it calls itself with y >> 1 as the new value of y. That operation shifts the bits right by one until all the bits are 0, which for a 32 bit int happens after at most 32 shifts.
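
For concreteness, here is a hand trace (my own worked example, not from the original post) of the call chain for fastMult(6, 13) with the function above:

    fastMult(6, 13)  ->  x < y, so the arguments are swapped  ->  fastMult(13, 6)
    fastMult(13, 6)  ->  res = fastMult(13, 3)
    fastMult(13, 3)  ->  res = fastMult(13, 1)
    fastMult(13, 1)  ->  res = fastMult(13, 0)
    fastMult(13, 0)  ->  base case, returns 0

    unwinding: (13, 1): res = 0 + 0   = 0,  y & 1  ->  res = 13 + 0  = 13
               (13, 3): res = 13 + 13 = 26, y & 1  ->  res = 13 + 26 = 39
               (13, 6): res = 39 + 39 = 78, y & 1 is 0             ->  78

The result 78 equals 6 * 13, and the recursion depth matches the bit length of the second argument after the swap (6 is 0b110, three bits, plus the base case).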

Does this mean it is an O(n log n) algorithm for multiplying two integers together (since it would also work for bigints with the same time complexity)?

Why O(n log n)? We are doing multiple O(1) and O(n) calculations when calling this function once. Since we call this function up to O(log n) times, we multiply O(log n) by O(n) and arrive at O(n log n). At least that's my understanding of it.

I am unsure about that, since all the usual methods of multiplying two integers take O(n*n) steps and only a few rather complex algorithms are faster than that, see: https://www.wikiwand.com/en/Multiplication_algorithm

Simpler version of this code for unsigned integers:

// x * y for unsigned x and y
unsigned fastMult(unsigned x, unsigned y) {
    if (y == 0) return 0;

    unsigned res = fastMult(x, y >> 1);
    res <<= 1; // O(1) or O(n), doesn't matter since O(n) is below this line and 2 times O(n) is still O(n)
    if (y & 1) res += x; // O(n)

    return res;
}
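
As a quick sanity check (my own test harness, not part of the question), this version can be compared against the built-in * operator; both compute the product modulo 2^32 for unsigned operands:

#include <assert.h>
#include <stdio.h>

// assumes the unsigned fastMult above is defined in the same file
int main(void) {
    assert(fastMult(0u, 7u) == 0u);
    assert(fastMult(6u, 13u) == 78u);
    assert(fastMult(123456789u, 987654321u) == 123456789u * 987654321u); // both wrap mod 2^32
    for (unsigned x = 0; x < 1000; x += 37)
        for (unsigned y = 0; y < 1000; y += 41)
            assert(fastMult(x, y) == x * y);
    puts("all tests passed");
    return 0;
}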

In the analysis of theoretical multiplication algorithms, each digit operation is usually counted as one operation, so res + res is considered to be n operations, not 1. Ergo, your algorithm here is indeed O(n log n) when compared to those other theoretical multiplication algorithms.

It's also worth noting that hardware uses yet a third way of counting: what counts is transistors, so each "parallel transistor row" is one operation. The naive ripple-carry adder we learn in school has n rows, which is O(n) time. Crafty algorithms can get addition down to O(log n) time.
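
To illustrate that last point, here is a word-level sketch (my own, not from the answer) of a Kogge-Stone style carry-lookahead adder; the combining loop runs log2(32) = 5 times, and each iteration models one row of the circuit, which is where the O(log n) depth comes from. The function name prefix_add is made up for this example.

#include <stdint.h>

// Word-level model of a parallel-prefix (Kogge-Stone) adder for 32-bit values.
// Each loop iteration models one constant-depth row of gates; 5 rows suffice for 32 bits.
uint32_t prefix_add(uint32_t a, uint32_t b) {
    uint32_t g = a & b;                  // generate: this bit produces a carry on its own
    uint32_t p = a ^ b;                  // propagate: this bit forwards an incoming carry
    for (int d = 1; d < 32; d <<= 1) {   // spans of 1, 2, 4, 8, 16 bits
        g |= p & (g << d);               // group generate over the wider span
        p &= p << d;                     // group propagate over the wider span
    }
    uint32_t carries = g << 1;           // carry into bit i is generated by bits 0..i-1
    return (a ^ b) ^ carries;            // sum bit = propagate XOR incoming carry
}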

Your problem seems to be a confusion between n and x and y, which causes you to mistakenly think you're doing something log(n) times when you are actually doing it n times. So to be clear:

  • n is the (maximum) number of digits/bits in the numbers you are multiplying. This is the usual quantity of interest when talking about the complexity of arithmetic operations.

  • x and y are the values you are multiplying, so each of them has up to n digits.

So when you recurse with a shift y >> 1 you're halving y, not n. This only removes one bit from y (reducing n by one), so you end up recursing O(n) times, not O(log(n)) times.
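
A small counting experiment (my own illustration, using a hypothetical helper called halvings) makes the distinction concrete: the number of times y can be halved before reaching 0 equals its bit length n, not log2(n):

#include <stdio.h>

// Count how many times y must be halved before it reaches 0.
// For a value with n significant bits the loop runs exactly n times.
static int halvings(unsigned y) {
    int count = 0;
    while (y != 0) { y >>= 1; count++; }
    return count;
}

int main(void) {
    printf("%d\n", halvings(0x80000000u)); // 32 significant bits -> prints 32; log2(32) would only be 5
    printf("%d\n", halvings(13u));         // 13 == 0b1101, 4 bits -> prints 4
    return 0;
}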
