
How could I refactor this code with performance in mind?

I have a method where performance is really important. (I know premature optimization is the root of all evil. I know I should profile my code, and I did. In this application, every tenth of a second I save is a big win.) This method uses different heuristics to generate and return elements. The heuristics are used sequentially: the first heuristic is used until it can no longer return elements, then the second heuristic is used until it can no longer return elements, and so on until all heuristics have been used. On each call of the method I use a switch to jump to the right heuristic. This is ugly, but it works well. Here is some pseudocode:

class MyClass
{
private:
   unsigned int m_step;
public:
   MyClass() : m_step(0) {};

   Elem GetElem()
   {
      // The compiler will typically turn this switch statement into a jump table.
      // Note that there are no break statements between the cases.
      switch (m_step)
      {
      case 0:
         if (UseHeuristic1())
         {
            m_step = 1; // Heuristic one is special: it will never provide more than one element.
            return theElem;
         }

         m_step = 1;

      case 1:
         DoSomeOneTimeInitialisationForHeuristic2();
         m_step = 2;

      case 2:
         if (UseHeuristic2())
         {
            return theElem;
         }

         m_step = 3;

      case 3:
         if (UseHeuristic3())
         {
            return theElem;
         }
         m_step = 4; // But the method should not be called again
      }

      return someErrorCode;
   }
};

As I said, this works and it's efficient, since at each call the execution jumps right where it should. If a heuristic can't provide an element, m_step is incremented (so we don't try this heuristic again next time) and, because there is no break statement, the next heuristic is tried. Also note that some steps (like step 1) never return an element, but are a one-time initialization for the next heuristic.

The reason the initializations are not all done upfront is that they might never be needed. It is always possible (and common) for GetElem not to be called again after it has returned an element, even if there are still elements it could return.

While this is an efficient implementation, I find it really ugly. The case statement is a hack; using it without break is also hackish; and the method gets really long, even though each heuristic is encapsulated in its own method.

How should I refactor this code so it's more readable and elegant while keeping it as efficient as possible?

Wrap each heuristic in an iterator. Initialize it completely on the first call to hasNext(). Then collect all the iterators in a list and use a super-iterator to iterate through all of them:

boolean hasNext () {
    // Drop exhausted iterators until one has an element or none are left.
    while (!list.isEmpty()) {
        if (list.get(0).hasNext()) return true;
        list.remove (0);
    }
    return false;
}
Object next () {
    return list.get (0).next ();
}

Note: In this case, a linked list might be a tiny bit faster than an ArrayList, but you should still check this.

[EDIT] Changed "turn each" into "wrap each" to make my intentions clearer.
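Since the question is in C++, the same idea can be sketched there as well. This is only an illustration under assumptions: Elem is taken to be an int, and Source and Chain are hypothetical names that do not appear in the question. Each heuristic becomes a lazily initialised source, and the chain drains the sources in order without revisiting an exhausted one.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

typedef int Elem;  // hypothetical element type; the question does not show Elem

// One heuristic wrapped as a lazy source (hypothetical helper type).
struct Source {
    Source(std::function<void()> i, std::function<bool(Elem&)> n)
        : init(std::move(i)), next(std::move(n)), initialised(false) {}

    std::function<void()> init;       // one-time setup, may be empty
    std::function<bool(Elem&)> next;  // fills its argument and returns true, or returns false when exhausted
    bool initialised;
};

// The "super-iterator": drains each source in order.
class Chain {
public:
    explicit Chain(std::vector<Source> sources)
        : sources_(std::move(sources)), current_(0) {}

    bool next(Elem& out) {
        while (current_ < sources_.size()) {
            Source& s = sources_[current_];
            if (!s.initialised) {      // setup runs only if this source is reached
                if (s.init) s.init();
                s.initialised = true;
            }
            if (s.next(out)) return true;
            ++current_;                // exhausted: never try this source again
        }
        return false;
    }

private:
    std::vector<Source> sources_;
    std::size_t current_;
};
```

The std::function indirection costs roughly as much as a virtual call, so this trades a little speed for structure; whether that matters here would have to be measured.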

I don't think your code is so bad, but if you're doing this kind of thing a lot, and you want to hide the mechanisms so that the logic is clearer, you could look at Simon Tatham's coroutine macros. They're intended for C (using static variables) rather than C++ (using member variables), but it's trivial to change that.

The result should look something like this:

Elem GetElem()
{
  crBegin;

  if (UseHeuristic1())
  {
     crReturn(theElem);
  }

  DoSomeOneTimeInitialisationForHeuristic2();

  while (UseHeuristic2())
  {
     crReturn(theElem);
  }

  while (UseHeuristic3())
  {
     crReturn(theElem);
  }

  crFinish;
  return someErrorCode;
}
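For readers who have not seen the macros, here is a rough sketch of how member-variable versions could be defined. These definitions are simplified variants written for this illustration, not copied from Simon Tatham's header; state_, i_ and the Counter class are assumptions. Each crReturn records the current line number and plants a case label at the same spot, so the next call jumps straight back to it.

```cpp
// Simplified member-variable variants of the macros (assumed definitions).
// state_ is an int member of the enclosing class, initially 0.
#define crBegin     switch (state_) { case 0:
#define crReturn(x) do { state_ = __LINE__; return (x); case __LINE__:; } while (0)
#define crFinish    }

// Hypothetical generator: yields 1, then 10, 11, 12, then -1 on every later call.
class Counter {
public:
    Counter() : state_(0), i_(0) {}

    int next() {
        crBegin;
        crReturn(1);                    // like heuristic 1: a single element
        for (i_ = 10; i_ < 13; ++i_) {  // loop state must live in a member, not a local
            crReturn(i_);
        }
        crFinish;
        return -1;                      // plays the role of someErrorCode
    }

private:
    int state_;  // line number of the last crReturn, or 0 before the first call
    int i_;
};
```

Two caveats with this kind of macro: each crReturn must sit on its own source line, and any variable that has to survive between calls must be a member rather than a local.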

It looks like there really isn't much to optimize in this code; most of the optimization can probably be done in the UseHeuristic functions. What's in them?

To my mind, if you do not need to modify this code much (e.g. to add new heuristics), then document it well and don't touch it.

However, if heuristics are added and removed and you think that this is an error-prone process, then you should consider refactoring it. The obvious choice would be to introduce the State design pattern. This will replace your switch statement with polymorphism, which might slow things down, but you would have to profile both to be sure.

You can turn the control flow inside out.

template <class Callback>  // a callback that returns true when it's done
void Walk(Callback fn)
{
    if (UseHeuristic1()) {
        if (fn(theElem))
            return;
    }
    DoSomeOneTimeInitialisationForHeuristic2();
    while (UseHeuristic2()) {
        if (fn(theElem))
            return;
    }
    while (UseHeuristic3()) {
        if (fn(theElem))
            return;
    }
}

This might earn you a few nanoseconds if the switch dispatch and the return statements are throwing the CPU off its stride, and the recipient is inlineable.

Of course, this kind of optimization is futile if the heuristics themselves are nontrivial. And much depends on what the caller looks like.
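To show what the caller side might look like, here is a small self-contained sketch. The Walk below is a stand-in that simply produces 1, 2, 3, ... up to a limit instead of running the real heuristics, and TakeUntilDivisible is a hypothetical consumer; both are assumptions made for illustration.

```cpp
#include <vector>

// Stand-in producer (assumption): pushes 1..limit into the callback
// and stops as soon as the callback returns true.
template <class Callback>
void Walk(int limit, Callback fn)
{
    for (int elem = 1; elem <= limit; ++elem) {
        if (fn(elem))
            return;
    }
}

// Hypothetical consumer: collects elements until the first one divisible by divisor.
inline std::vector<int> TakeUntilDivisible(int limit, int divisor)
{
    std::vector<int> seen;
    Walk(limit, [&](int elem) {
        seen.push_back(elem);
        return elem % divisor == 0;  // true means "stop producing"
    });
    return seen;
}
```

Because the callback is a template parameter, the compiler can inline the lambda into the producer loop, which is where the saving mentioned above would come from.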

That's a micro-optimization, but there is no need to set m_step when you are not returning from GetElem. See the modified GetElem below.

Larger optimizations definitely require simplifying the control flow (fewer jumps, fewer returns, fewer tests, fewer function calls), because a mispredicted jump can stall the processor pipeline (most processors have branch prediction, but it's no silver bullet). You can try the solutions proposed by Aaron or Jason, and there are others (for instance, you can implement several get_elem functions and call them through a function pointer, but I'm quite sure it will be slower).

If the problem allows it, it can also be efficient to compute several elements at once in the heuristics and use some cache, or to make it truly parallel, with some threads computing elements and this one merely a consumer waiting for results. There is no way to say more without some details on the context.

class MyClass
{
private:
   unsigned int m_step;
public:
   MyClass() : m_step(0) {};

   Elem GetElem()
   {
      // The compiler will typically turn this switch statement into a jump table.
      // Note that there are no break statements between the cases.
      switch (m_step)
      {
      case 0:
         if (UseHeuristic1())
         {
            m_step = 1; // Heuristic one is special: it will never provide more than one element.
            return theElem;
         }

      case 1:
         DoSomeOneTimeInitialisationForHeuristic2();
         m_step = 2;

      case 2:
         if (UseHeuristic2())
         {
            return theElem;
         }

      case 3:
         m_step = 4;

      case 4:
         if (UseHeuristic3())
         {
            return theElem;
         }
         m_step = 5; // But the method should not be called again
      }

      return someErrorCode;
   }
};
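The suggestion earlier in this answer, computing several elements at once and serving them from a cache, could be sketched roughly as follows. Elem is assumed to be an int, and BatchedSource and its refill callback are hypothetical names; whether batching pays off depends on how often callers stop early, which the question says is common.

```cpp
#include <deque>
#include <functional>
#include <utility>
#include <vector>

typedef int Elem;  // hypothetical element type

// Serves elements one at a time from a buffer that is refilled in batches.
class BatchedSource {
public:
    // refill is assumed to return the next batch, or an empty vector when exhausted.
    explicit BatchedSource(std::function<std::vector<Elem>()> refill)
        : refill_(std::move(refill)), done_(false) {}

    bool next(Elem& out) {
        if (buffer_.empty() && !done_) {
            std::vector<Elem> batch = refill_();
            buffer_.insert(buffer_.end(), batch.begin(), batch.end());
            done_ = buffer_.empty();   // an empty batch means nothing is left
        }
        if (buffer_.empty()) return false;
        out = buffer_.front();
        buffer_.pop_front();
        return true;
    }

private:
    std::function<std::vector<Elem>()> refill_;
    std::deque<Elem> buffer_;
    bool done_;
};
```

Most calls then only pop from the buffer; the expensive work happens once per batch.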

What you can really do here is replace the conditional with the State pattern.

http://en.wikipedia.org/wiki/State_pattern

It may be less performant because of the virtual method call, or it may be more performant because there is less state-maintaining code, but the code would definitely be much clearer and more maintainable, as always with patterns.

What could improve performance is eliminating the DoSomeOneTimeInitialisationForHeuristic2() call by using a separate state between states 1 and 2.
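A minimal sketch of what that could look like, with the one-time initialisation as its own state. Everything here is an assumption made for illustration: Elem is an int, the heuristics are toy ones (heuristic 1 yields 1, heuristic 2 yields 20 and 21), and the class and variable names are invented.

```cpp
typedef int Elem;  // hypothetical element type

class Generator;

// One state per step of the original switch.
struct State {
    virtual ~State() {}
    // Fills out and returns true, or moves gen to its next state and returns false.
    virtual bool produce(Generator& gen, Elem& out) = 0;
};

struct Heuristic1State : State { bool produce(Generator& gen, Elem& out); };
struct InitHeuristic2State : State { bool produce(Generator& gen, Elem& out); };
struct Heuristic2State : State { bool produce(Generator& gen, Elem& out); };
struct DoneState : State { bool produce(Generator&, Elem&) { return false; } };

// The states hold no data of their own, so one shared instance of each is enough.
static Heuristic1State kHeuristic1;
static InitHeuristic2State kInitHeuristic2;
static Heuristic2State kHeuristic2;
static DoneState kDone;

class Generator {
public:
    Generator() : initialised(false), produced2(0), state_(&kHeuristic1) {}

    // Returns false once every heuristic is exhausted.
    bool GetElem(Elem& out) {
        while (state_ != &kDone) {
            if (state_->produce(*this, out)) return true;
        }
        return false;
    }

    void transitionTo(State& next) { state_ = &next; }

    bool initialised;  // toy data shared with the states
    int produced2;

private:
    State* state_;
};

bool Heuristic1State::produce(Generator& gen, Elem& out) {
    gen.transitionTo(kInitHeuristic2);  // at most one element, so always move on
    out = 1;
    return true;
}

bool InitHeuristic2State::produce(Generator& gen, Elem&) {
    gen.initialised = true;             // one-time setup, runs only if this state is reached
    gen.transitionTo(kHeuristic2);
    return false;
}

bool Heuristic2State::produce(Generator& gen, Elem& out) {
    if (gen.produced2 < 2) {
        out = 20 + gen.produced2++;
        return true;
    }
    gen.transitionTo(kDone);
    return false;
}
```

Adding or removing a heuristic then means adding or removing one state class and adjusting one transition, at the price of a virtual call per step.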

Since each heuristic is represented by a function with an identical signature, you can make a table of function pointers and walk through it.

class MyClass 
{ 
private: 
   typedef bool heuristic_function();
   typedef heuristic_function * heuristic_function_ptr;
   static heuristic_function_ptr heuristic_table[4];
   unsigned int m_step; 
public: 
   MyClass() : m_step(0) {}; 

   Elem GetElem() 
   { 
      while (m_step < sizeof(heuristic_table)/sizeof(heuristic_table[0]))
      {
         if (heuristic_table[m_step]())
         {
            return theElem;
         }
         ++m_step;
      }

      return someErrorCode; 
   }; 
}; 

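// Every entry must have the signature bool(). The initialisation step must
// return false so that the loop moves on to UseHeuristic2.
MyClass::heuristic_function_ptr MyClass::heuristic_table[4] =
{
   UseHeuristic1,
   DoSomeOneTimeInitialisationForHeuristic2,
   UseHeuristic2,
   UseHeuristic3
};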
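If the heuristics are member functions rather than free functions, the table needs pointers to members instead. The following is a sketch under assumptions, not the asker's actual class: Elem is an int, kNoMoreElements stands in for someErrorCode, and the heuristics are toy ones. Note that a step is called again after it returns an element, so heuristic 1 has to remember that it has already been used.

```cpp
#include <cstddef>

typedef int Elem;                 // hypothetical element type
const Elem kNoMoreElements = -1;  // stands in for someErrorCode

class MyClass {
public:
    MyClass() : m_step(0), m_elem(0), m_used1(false), m_next2(0) {}

    Elem GetElem() {
        while (m_step < kNumSteps) {
            if ((this->*kSteps[m_step])()) return m_elem;
            ++m_step;  // this step is exhausted, never call it again
        }
        return kNoMoreElements;
    }

private:
    typedef bool (MyClass::*Step)();

    // Toy heuristic 1: provides exactly one element.
    bool UseHeuristic1() {
        if (m_used1) return false;
        m_used1 = true;
        m_elem = 1;
        return true;
    }
    // One-time setup: always returns false so the loop advances.
    bool InitHeuristic2() { m_next2 = 20; return false; }
    // Toy heuristic 2: provides 20 and 21.
    bool UseHeuristic2() {
        if (m_next2 >= 22) return false;
        m_elem = m_next2++;
        return true;
    }

    static const std::size_t kNumSteps = 3;
    static const Step kSteps[kNumSteps];

    std::size_t m_step;
    Elem m_elem;
    bool m_used1;
    Elem m_next2;
};

const MyClass::Step MyClass::kSteps[MyClass::kNumSteps] = {
    &MyClass::UseHeuristic1,
    &MyClass::InitHeuristic2,
    &MyClass::UseHeuristic2
};
```

An indirect call through the table is usually harder for the compiler to inline than the original switch, so this version is about readability more than speed.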

If the element code you are processing can be converted to an integral value, then you can construct a table of function pointers and index it by the element. The table would have one entry for each 'handled' element, and one for each known but unhandled element. For unknown elements, do a quick check before indexing the function pointer table.

Calling the element-processing function is fast.

Here's working sample code:

#include <cstdlib>
#include <iostream>
using namespace std;

typedef void (*ElementHandlerFn)(void);

void ProcessElement0()
{
    cout << "Element 0" << endl;
}

void ProcessElement1()
{
    cout << "Element 1" << endl;
}
void ProcessElement2()
{
    cout << "Element 2" << endl;
}

void ProcessElement3()
{
    cout << "Element 3" << endl;
}

void ProcessElement7()
{
    cout << "Element 7" << endl;
}

void ProcessUnhandledElement()
{
    cout << "> Unhandled Element <" << endl;
}




int main()
{
    // construct a table of function pointers, one for each possible element (even unhandled elements)
    // note: i am assuming that there are 10 possible elements -- 0, 1, 2 ... 9 --
    // and that 5 of them (0, 1, 2, 3, 7) are 'handled'.

    static const size_t MaxElement = 9;
    ElementHandlerFn handlers[] = 
    {
        ProcessElement0,
        ProcessElement1,
        ProcessElement2,
        ProcessElement3,
        ProcessUnhandledElement,
        ProcessUnhandledElement,
        ProcessUnhandledElement,
        ProcessElement7,
        ProcessUnhandledElement,
        ProcessUnhandledElement
    };

    // mock up some elements to simulate input, including 'invalid' elements like 12
    int testElements [] = {0, 1, 2, 3, 7, 4, 9, 12, 3, 3, 2, 7, 8 };
    size_t numTestElements = sizeof(testElements)/sizeof(testElements[0]);

    // process each test element
    for( size_t ix = 0; ix < numTestElements; ++ix )
    {
        // for some robustness...
        if( testElements[ix] < 0 || static_cast<size_t>(testElements[ix]) > MaxElement )
            cout << "Invalid Input!" << endl;
        // otherwise process normally
        else
            handlers[testElements[ix]]();

    }

    return 0;
}

If it ain't broke, don't fix it.

It looks pretty efficient as is. It doesn't look hard to understand either. Adding iterators etc. is probably going to make it harder to understand.

You are probably better off doing:

  1. Performance analysis. Is time really spent in this procedure at all, or is most of it spent in the functions that it calls? I can't see any significant time being spent here.
  2. More unit tests, to prevent someone from breaking it if they have to modify it.
  3. Additional comments in the code.
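For the first point, a real profiler is the better tool, but a crude timing harness can already show whether GetElem itself or the heuristics it calls dominate. A minimal sketch, assuming the code under test can be wrapped in a callable; AverageNanoseconds is a hypothetical helper name:

```cpp
#include <chrono>

// Runs fn the given number of times and returns the average wall-clock
// time per call in nanoseconds.
template <class Fn>
double AverageNanoseconds(Fn fn, int iterations)
{
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

Timing the whole GetElem call and then each UseHeuristic function separately gives a first idea of where the time goes. Keep in mind that the optimizer may remove calls whose results are unused, so the measured code should have an observable effect.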
