Why does this algorithm run so much faster in Python than in C++?

Question

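I am reading "Algorithms in C++" by Robert Sedgewick and was given the following exercise: rewrite this weighted quick-union algorithm with path compression by halving in another programming language.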

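The algorithm checks whether two objects are connected. For example, given the entries 1-2, 2-3 and 1-3, the first two entries create new connections. In the third entry, 1 and 3 are already connected, because 3 can be reached from 1 via 1-2-3, so the third entry does not need a new connection.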
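The 1-2, 2-3, 1-3 example above can be traced with a minimal sketch. The helper names `find` and `union` and the four-element arrays are assumptions made for illustration; they do not appear in the book or in the code below.

```python
def find(parent, x):
    """Return the root of x, halving the path on the way up."""
    while x != parent[x]:
        parent[x] = parent[parent[x]]  # point x at its grandparent
        x = parent[x]
    return x


def union(parent, size, a, b):
    """Connect a and b; return True only if a new connection was created."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a == root_b:
        return False  # already connected
    if size[root_a] < size[root_b]:
        root_a, root_b = root_b, root_a
    parent[root_b] = root_a  # attach the smaller tree to the larger one
    size[root_a] += size[root_b]
    return True


parent = list(range(4))  # objects 0..3; index 0 is unused in this example
size = [1] * 4
results = [union(parent, size, a, b) for a, b in [(1, 2), (2, 3), (1, 3)]]
print(results)  # [True, True, False]
```

The last entry yields False because 1 and 3 already share a root after the first two unions.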

Sorry if the description of the algorithm is hard to understand; English is not my mother tongue.

So here is the algorithm itself:

#include <iostream>
#include <cstdlib> // srand, rand
#include <ctime>

using namespace std;

static const int N {100000};

int main()
{
    srand(time(NULL));

    int i; 
    int j; 

    int id[N];  
    int sz[N]; // Stores tree sizes

    int Ncount{}; // Counts the number of new connections
    int Mcount{}; // Counts the number of all attempted connections

    for (i = 0; i < N; i++)
    {
        id[i] = i;
        sz[i] = 1;
    }

    while (Ncount < N - 1)
    {
        i = rand() % N;
        j = rand() % N;

        for (; i != id[i]; i = id[i])
            id[i] = id[id[i]];

        for (; j != id[j]; j = id[j])
            id[j] = id[id[j]];

        Mcount++;

        if (i == j) // Checks if i and j are connected 
            continue;

        if (sz[i] < sz[j]) // Smaller tree will be 
                           // connected to a bigger one 
        {
            id[i] = j;
            sz[j] += sz[i];
        }
        else
        {
            id[j] = i;
            sz[i] += sz[j];
        }

        Ncount++;
    }

    cout << "Mcount: " << Mcount << endl;
    cout << "Ncount: " << Ncount << endl;

    return 0;
}

I know a little Python, so I chose it for this exercise. This is what I got:

import random

N = 100000

idList = list(range(0, N))
sz = [1] * N

Ncount = 0
Mcount = 0

while Ncount < N - 1:

    i = random.randrange(0, N)
    j = random.randrange(0, N)

    while i is not idList[i]:
        idList[i] = idList[idList[i]]
        i = idList[i]

    while j is not idList[j]:
        idList[j] = idList[idList[j]]
        j = idList[j]

    Mcount += 1

    if i is j:
        continue

    if sz[i] < sz[j]:
        idList[i] = j
        sz[j] += sz[i]
    else:
        idList[j] = i
        sz[i] += sz[j]

    Ncount += 1

print("Mcount: ", Mcount)
print("Ncount: ", Ncount)

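But I stumbled upon an interesting nuance: when I set N to 100000 or higher, the C++ version seems to be much slower than the Python one. The Python version takes about 10 seconds to finish the task, whereas the C++ version is so slow that I just have to close it.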

So my question is: what is the reason? Does it happen because of the difference between rand() % N and random.randrange(0, N)? Or am I doing something wrong?

I would be very grateful if someone could explain this to me!

Answer 1

These two programs do different things.

In Python you have to compare numbers with ==.

>>> x=100000
>>> y=100000
>>> x is y
False

There may be other problems; I have not checked. Have you compared the results of the two programs?
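The difference between the two operators can be shown with a small sketch. The helper `compare` is a hypothetical name introduced here. The integers are built with `int("100000")` at run time so that the interpreter cannot merge them into one shared constant, which can happen with literals inside a single script.

```python
def compare(a, b):
    """Return (equal by value, same object) for a and b."""
    return a == b, a is b


# Two integers with the same value, created separately at run time.
x = int("100000")
y = int("100000")
same_value, same_object = compare(x, y)
print(same_value)   # True: == compares values
print(same_object)  # depends on the interpreter; do not rely on it
```

Whether `x is y` holds for equal integers is an implementation detail, so value comparisons should use ==.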

Answer 2

As mentioned above, the two programs are not equivalent, especially in their use of is vs ==.

Look at the following Python code:

while i is not idList[i]:
    idList[i] = idList[idList[i]]
    i = idList[i]

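The loop body is evaluated 0 or 1 times. Why? Because if the while condition evaluates to True the first time, then i = idList[i] makes i is idList[i] hold on the second pass, since i is now certainly an object that is in idList.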

The equivalent C++:

for (; i != id[i]; i = id[i])
     id[i] = id[id[i]];

Here the code checks for equality rather than identity, and the number of iterations is not fixed at 0 or 1.

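So yes... using is vs == makes a big difference, because in Python you are testing instance identity (whether i is the very object contained in the list), not plain equality in the sense of equivalence.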

Comparing the Python and C++ code above is like comparing apples and pears.

Note: the short answer to the question is that the Python version runs much faster because it does a lot less work than the C++ version.
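For comparison, here is a sketch of how the Python program might look once every is / is not is replaced by == / !=, so that its loops test the same condition as the C++ loops. Wrapping the logic in a function named `simulate`, the `seed` parameter, and the small `n` used in the usage lines are assumptions added for illustration; they are not part of the original post.

```python
import random


def simulate(n, seed=None):
    """Connect random pairs until all n objects form one component.

    Returns (attempted connections, new connections).
    """
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n
    new_count = 0
    attempt_count = 0

    while new_count < n - 1:
        i = rng.randrange(n)
        j = rng.randrange(n)

        while i != parent[i]:  # value comparison, as in the C++ loop
            parent[i] = parent[parent[i]]
            i = parent[i]
        while j != parent[j]:
            parent[j] = parent[parent[j]]
            j = parent[j]

        attempt_count += 1
        if i == j:  # already connected
            continue
        if size[i] < size[j]:  # attach the smaller tree to the larger one
            parent[i] = j
            size[j] += size[i]
        else:
            parent[j] = i
            size[i] += size[j]
        new_count += 1

    return attempt_count, new_count


m_count, n_count = simulate(1000, seed=0)
print("Mcount:", m_count)
print("Ncount:", n_count)
```

With value comparisons, the loop always ends with exactly n - 1 new connections, which makes the output directly comparable with that of the C++ program.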

Answer 1
3 2016-02-06 18:26:35

Answer 2
2 2016-02-07 06:10:48
