
C++ map lookup performance vs PHP array lookup performance

I can't understand the following and I'm hoping someone can shed some light on it for me:

In C++, if I create a vector of test data containing 2M different bits of text (testdata), then create a map using these strings as the index values, then look up all the values, like this:

 //Create test data
for(int f=0; f<loopvalue; f++)
{   
    stringstream convertToString;
    convertToString << f;
    string strf = convertToString.str();
    testdata[f] = "test" + strf;
}

    time_t startTimeSeconds = time(NULL);

   for(int f=0; f<2000000; f++) testmap[ testdata[f] ] = f; //Write to map
   for(int f=0; f<2000000; f++) result = testmap[ testdata[f] ]; //Lookup

   time_t endTimeSeconds = time(NULL);
   cout << "Time taken " << endTimeSeconds - startTimeSeconds << "seconds." << endl;

It takes 10 seconds.

If I do what seems to be the same thing in PHP:

<?php
$starttime = time();
$loopvalue = 2000000;

//fill array
for($f=0; $f<$loopvalue; $f++)
{
    $filler = "test" . $f;
    $testarray[$filler] = $f;
}

//look up array
for($f=0; $f<$loopvalue; $f++)
{
    $filler = "test" . $f;
    $result = $testarray[$filler];
}

$endtime = time();
echo "Time taken ".($endtime-$starttime)." seconds.";
?>

...it takes only 3 seconds.

Given that PHP is written in C, does anyone know how PHP achieves this much faster text index lookup?

Thanks, C

Your loops are not exactly equivalent algorithms. Note that in the C++ version you have

  1. testmap[ testdata[f] ] (first loop) - this is actually a lookup + insert
  2. testmap[ testdata[f] ] (second loop) - 2 lookups

In the PHP version you just have an insert in the first loop and a lookup in the second one.
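
As a minimal sketch of the operator[] behaviour described above: on a std::map, operator[] inserts a default-constructed value when the key is missing, while find() only reads. The helper names size_after_bracket_lookup and find_lookup are made up for this illustration and are not part of the original code.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: reads `key` through operator[] and returns the map
// size afterwards. operator[] inserts {key, 0} when the key is absent, so
// a "read" of a missing key grows the map.
std::size_t size_after_bracket_lookup(std::map<std::string, int>& m,
                                      const std::string& key) {
    int value = m[key];  // lookup, plus an insert if the key is missing
    (void)value;         // value is not needed here; silence unused warning
    return m.size();
}

// Hypothetical helper: reads `key` through find(), which never modifies
// the map. Returns true and writes the value to `out` only when found.
bool find_lookup(const std::map<std::string, int>& m,
                 const std::string& key, int& out) {
    std::map<std::string, int>::const_iterator it = m.find(key);
    if (it == m.end()) return false;
    out = it->second;
    return true;
}
```

In the benchmark every key already exists by the time the second loop runs, so operator[] does not insert anything there; find() is simply the form that expresses a pure lookup.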

PHP is interpreted, so as a rule, if your code runs faster in PHP, check the code first! ;-)

I suspect you are benchmarking the wrong thing. Anyway, I used your code (I had to make some assumptions about your data types) and here are the results from my machine:

PHP: Time taken 2 seconds.

C++ (using std::map): Time taken 3 seconds.

C++ (using std::tr1::unordered_map): Time taken 1 seconds.

C++ compiled with

g++ -O3

Here is my test C++ code:

#include <ctime>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>
#include <tr1/unordered_map>

int main(){
    const int loopvalue=2000000;
    std::vector<std::string> testdata(loopvalue);
    // Swap in std::map<std::string, int> to reproduce the std::map timing.
    std::tr1::unordered_map<std::string, int> testmap;
    int result = 0; // the map stores int values
    // Create test data: "test0", "test1", ...
    for(int f=0; f<loopvalue; f++)
    {
        std::stringstream convertToString;
        convertToString << f;
        std::string strf = convertToString.str();
        testdata[f] = "test" + strf;
    }

    // time() has one-second resolution, so the result is a whole number of seconds
    std::time_t startTimeSeconds = std::time(NULL);

    for(int f=0; f<loopvalue; f++) testmap[ testdata[f] ] = f; //Write to map
    for(int f=0; f<loopvalue; f++) result = testmap[ testdata[f] ]; //Lookup

    std::time_t endTimeSeconds = std::time(NULL);
    std::cout << "Time taken " << endTimeSeconds - startTimeSeconds << " seconds." << std::endl;
}

Conclusion:

You tested unoptimized C++ code, probably even compiled with VC++, which by default has a bounds check in std::vector::operator[] when compiled in debug mode.

There is still a gap between PHP and the optimised C++ code when we use std::map, because of the difference in lookup complexity (see n0rd's answer), but C++ is faster when you use a hash map.

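According to another question, associative arrays in PHP are implemented as hash tables, which have an average search complexity of O(1), whereas std::map in C++ is a binary tree with a search complexity of O(log n), which is slower.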
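
To see the tree-versus-hash-table difference on your own machine, something like the following sketch could be used. It assumes a C++11 compiler (so std::unordered_map and std::chrono instead of the tr1 header and time()), and the names make_keys and time_fill_and_lookup are invented for this example. Because time() only has one-second resolution, the sketch reports milliseconds.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: builds the keys "test0" .. "test<n-1>".
std::vector<std::string> make_keys(int n) {
    std::vector<std::string> keys;
    keys.reserve(n);
    for (int i = 0; i < n; ++i) keys.push_back("test" + std::to_string(i));
    return keys;
}

// Hypothetical helper: fills a Map with keys[i] -> i, then reads every key
// back with find(). Returns the elapsed time in milliseconds. `checksum`
// receives the sum of all values found, so the lookups have a visible
// result that the optimiser cannot simply drop.
template <typename Map>
long long time_fill_and_lookup(const std::vector<std::string>& keys,
                               long long& checksum) {
    Map m;
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < keys.size(); ++i)
        m[keys[i]] = static_cast<int>(i);          // write to map
    checksum = 0;
    for (const auto& k : keys) {
        auto it = m.find(k);                       // pure lookup
        if (it != m.end()) checksum += it->second;
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(
               stop - start).count();
}
```

Calling time_fill_and_lookup<std::map<std::string, int>>(keys, sum) and then time_fill_and_lookup<std::unordered_map<std::string, int>>(keys, sum) on the same make_keys(2000000) vector gives two timings to compare; the actual numbers depend on the compiler, optimisation flags, and hardware.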
