How to speed up a simple method (preferably without changing interfaces or data structures)?

I have some data structures:

  • all_unordered_m is a big vector containing all the strings I need (all different)
  • ordered_m is a small vector containing the indexes of a subset of the strings (all different) in the former vector
  • position_m maps the indexes of objects from the first vector to their position in the second one.

The string_after(index, reverse) method returns the string referenced by ordered_m after all_unordered_m[index].

ordered_m is considered circular, and is explored in natural or reverse order depending on the second parameter.

The code is something like the following:

struct ordered_subset {
    // [...]

    std::vector<std::string>& all_unordered_m; // size = n >> 1
    std::vector<size_t> ordered_m;             // size << n
    // index in all_unordered_m -> position in ordered_m
    std::tr1::unordered_map<size_t, size_t> position_m;

    const std::string&
    string_after(size_t index, bool reverse) const
    {
        size_t pos = position_m.find(index)->second;
        if(reverse)
            pos = (pos == 0 ? ordered_m.size() - 1 : pos - 1);
        else
            pos = (pos == ordered_m.size() - 1 ? 0 : pos + 1);
        return all_unordered_m[ordered_m[pos]];
    }
};

Given that:

  • I do need all of the data structures for other purposes;
  • I cannot change them because I need to access the strings:
    • by their id in all_unordered_m;
    • by their index inside the various ordered_m;
  • I need to know the position of a string (identified by its position in the first vector) inside the ordered_m vector;
  • I cannot change the string_after interface without changing most of the program.

How can I speed up the string_after method, which is called billions of times and is eating up about 10% of the execution time?

EDIT: I've tried making position_m a vector instead of an unordered_map and using the following method to avoid jumps:

const std::string&
string_after(size_t index, int direction) const   // direction is +1 or -1
{
  return all_unordered_m[ordered_m[
      (ordered_m.size()+position_m[index]+direction)%ordered_m.size()]];
}

The change in position_m seems to be the most effective. I'm not sure that eliminating the branches made any difference; I'm tempted to say that the code is more compact but equally efficient in that regard.
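For anyone who wants to try the vector-based variant in isolation, here is a minimal sketch of it. The struct name, the constructor, and the not_in_subset sentinel are assumptions made for illustration and do not appear in the original code; the sketch also holds all_unordered_m by const reference, which the original does not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sentinel marking strings that are not part of the subset.
const std::size_t not_in_subset = static_cast<std::size_t>(-1);

// Hypothetical stand-in for the question's ordered_subset.
struct ordered_subset_sketch {
    const std::vector<std::string>& all_unordered_m;
    std::vector<std::size_t> ordered_m;   // indexes into all_unordered_m
    std::vector<std::size_t> position_m;  // index -> position in ordered_m

    ordered_subset_sketch(const std::vector<std::string>& all,
                          std::vector<std::size_t> ordered)
        : all_unordered_m(all),
          ordered_m(std::move(ordered)),
          position_m(all.size(), not_in_subset)
    {
        // One slot per string in all_unordered_m, filled only for the subset.
        for (std::size_t pos = 0; pos < ordered_m.size(); ++pos)
            position_m[ordered_m[pos]] = pos;
    }

    // direction must be +1 (natural order) or -1 (reverse order).
    const std::string& string_after(std::size_t index, int direction) const
    {
        assert(index < position_m.size());
        assert(position_m[index] != not_in_subset);
        const std::size_t n = ordered_m.size();
        // direction is converted to size_t; for -1 the unsigned sum wraps
        // around to n + position - 1, so the modulo stays in range.
        const std::size_t pos = (n + position_m[index] + direction) % n;
        return all_unordered_m[ordered_m[pos]];
    }
};
```

The table costs one size_t per string in all_unordered_m for every subset, so with many subsets the extra memory may matter; whether that trade is worth it depends on the actual sizes.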

vector lookups are blazing fast. size() calls and simple arithmetic are blazing fast. map lookups, in comparison, are as slow as a dead turtle with a block of concrete on his back. I have often seen those become a bottleneck in otherwise simple code like this.

You could try unordered_map from TR1 or C++0x (a drop-in hashtable replacement for map) instead and see if that makes a difference.
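A rough way to check this on your own data is to run the same lookups against each container and time them. The helpers below are a sketch written for that purpose; their names (sum_map_lookups, sum_vector_lookups, elapsed_ms) are made up for this example, and the numbers will depend heavily on the compiler, the optimization flags, and the key distribution.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <unordered_map>
#include <vector>

// Sums the positions found for every key. The sum exists only so that the
// compiler cannot discard the lookups. Works with std::map and
// std::unordered_map alike.
template <typename Map>
std::size_t sum_map_lookups(const Map& positions,
                            const std::vector<std::size_t>& keys)
{
    std::size_t sum = 0;
    for (std::size_t key : keys) {
        const auto it = positions.find(key);
        assert(it != positions.end());  // every key is expected to be present
        sum += it->second;
    }
    return sum;
}

// Same work with a dense vector indexed directly by key.
std::size_t sum_vector_lookups(const std::vector<std::size_t>& positions,
                               const std::vector<std::size_t>& keys)
{
    std::size_t sum = 0;
    for (std::size_t key : keys) {
        assert(key < positions.size());
        sum += positions[key];
    }
    return sum;
}

// Runs work() once and returns the elapsed wall-clock time in milliseconds.
template <typename Work>
double elapsed_ms(Work work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling elapsed_ms with a lambda that stores the result of one of the sum functions, on a key list as large as the real workload, gives a first impression; a single run is noisy, so repeating it a few times is advisable.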

Well, in such cases (a small function that is called often) every branch can be very expensive. There are two things that come to mind.

  1. Could you leave out the reverse parameter and make it two separate methods? This only makes sense if that doesn't simply push the if-statement to the calling code.
  2. Try the following for calculating pos: pos = (pos + 1) % ordered_m.size() (this is for the forward case). This only works if you are sure that pos never overflows when incrementing it.

In general, try to replace branches with arithmetic operations in such cases; this can give you a substantial speedup.
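Both suggestions can be combined into a pair of small helpers, one per direction. This is only a sketch: next_circular and prev_circular are hypothetical names, and the reverse formula is my extension of the forward one given above.

```cpp
#include <cassert>
#include <cstddef>

// Position that follows pos in a circular sequence of `size` elements.
inline std::size_t next_circular(std::size_t pos, std::size_t size)
{
    assert(size > 0 && pos < size);
    return (pos + 1) % size;
}

// Position that precedes pos. Adding size before subtracting 1 keeps the
// unsigned intermediate value from underflowing when pos is 0.
inline std::size_t prev_circular(std::size_t pos, std::size_t size)
{
    assert(size > 0 && pos < size);
    return (pos + size - 1) % size;
}
```

Keep in mind that % with a divisor unknown at compile time is an integer division, which is not free either; a well-predicted branch may turn out to be just as fast, so it is worth measuring both forms.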
