
C++ optimization of reference-to-pointer argument

I'm wondering, with functions like the following, whether to use a temporary variable (p):

void parse_foo(const char*& p_in_out,
               foo& out) {
    const char* p = p_in_out;

    /* Parse, p gets incremented etc. */

    p_in_out = p;
}

or can I just use the original argument and expect it to be optimized similarly to the above anyway? It seems like there should be such an optimization, but I've seen the above done in a few places such as Mozilla's code, with vague comments about "avoiding aliasing".

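All good answers, but if you're worried about performance, the actual parsing is going to take nearly all of the time, so the cost of pointer aliasing will probably be lost in the noise.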

The variant with a temporary variable could be faster, since it doesn't require every change to the pointer to be reflected back to the argument, so the compiler has a better chance of generating faster code. However, the right way to test this is to compile both versions and look at the disassembly.

Meanwhile, this has nothing to do with avoiding aliasing. In fact, the variant with a temporary variable does employ aliasing: now you have two pointers into the same array, and that's exactly what aliasing is.

I would use a temporary if there is a possibility that the function is transactional,

i.e. the function succeeds or fails completely (no middle ground).
In this case I would use a temp to maintain state while the function executes and only assign back to the in_out parameter when the function completes successfully.

If the function exits prematurely (e.g. via an exception) then we have two situations:

  • With a temporary, the external pointer is unchanged.
  • Using the parameter directly, the external state is modified to reflect the position reached.

I don't see any optimization advantages to either method.
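To make the transactional point concrete, here is a minimal sketch. The grammar (decimal digits followed by ';'), the value member of foo, and the choice of std::invalid_argument are all assumptions made up for illustration; the question doesn't say what is actually being parsed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical output type; the real foo is not shown in the question.
struct foo {
    int value = 0;
};

// Parses one or more decimal digits followed by ';'.
// All work happens on a local copy of the cursor; the caller's pointer and
// the output are only updated once the whole parse has succeeded.
void parse_foo(const char*& p_in_out, foo& out) {
    assert(p_in_out != nullptr);
    const char* p = p_in_out;  // local working copy
    int value = 0;

    if (*p < '0' || *p > '9') {
        throw std::invalid_argument("expected a digit");
    }
    while (*p >= '0' && *p <= '9') {
        value = value * 10 + (*p - '0');
        ++p;
    }
    if (*p != ';') {
        // Digits were consumed locally, but p_in_out has not moved.
        throw std::invalid_argument("expected ';'");
    }
    ++p;

    // Commit point: nothing below can throw.
    out.value = value;
    p_in_out = p;
}
```

If the parse throws halfway through, the caller still sees the pointer where it was before the call and can retry or report the error position from there. Had the loop incremented p_in_out directly, the caller would be left with a pointer somewhere in the middle of the input.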

Yes, you should assign it to a local that you mark restrict (spelled __restrict in MSVC; GCC and Clang accept __restrict too).

The reason for this is that if the compiler cannot be absolutely sure that nothing else in the scope points at p_in_out, it cannot store the contents under the pointer in a local register. It must read the data back every time you write to any other char * in the same scope. This is not an issue of whether it is a "smart" compiler or not; it is a consequence of correctness requirements.

By writing char* __restrict p you promise the compiler that no other pointer in the same scope points to the same address as p. Without this guarantee, the value of *p can change any time any other pointer is written to, or a write to *p may change the contents of some other pointer. Thus, the compiler has to write out every assignment to *p back to memory immediately, and it has to read *p back every time another pointer is written through.

So, guaranteeing the compiler that this cannot happen (that it can load *p exactly once and assume no other pointer affects it) can be an improvement in performance. Exactly how much depends on the particular compiler and situation: on processors subject to a load-hit-store penalty, it's massive; on most x86 CPUs, it's modest.
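As a rough sketch of what that looks like in practice, assuming a parser that also writes to a char buffer: the function name copy_upper and its contract are made up for illustration, and __restrict is a compiler extension, not standard C++. The promise only holds if the input and dst really don't overlap.

```cpp
#include <cassert>
#include <cstddef>

// Copies a run of lowercase ASCII letters from the input into dst,
// upper-casing them, and returns the number of letters copied.
// Assumptions: dst has room for the run plus a terminating '\0',
// and the input and dst do not overlap.
std::size_t copy_upper(const char*& p_in_out, char* dst) {
    assert(p_in_out != nullptr && dst != nullptr);

    // Local copies. Their addresses are never taken, so the compiler is
    // free to keep them in registers; __restrict additionally promises
    // that the bytes read through p are not the ones written through d.
    const char* __restrict p = p_in_out;
    char* __restrict d = dst;

    std::size_t n = 0;
    while (*p >= 'a' && *p <= 'z') {
        d[n++] = static_cast<char>(*p - 'a' + 'A');
        ++p;
    }
    d[n] = '\0';

    p_in_out = p;  // single write-back through the reference
    return n;
}
```

Without the local copy, each store through a char* could in principle modify the pointer that p_in_out refers to, because char stores are allowed to alias anything. Whether this changes the generated code for a given compiler is something to check in the disassembly rather than take on faith.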

The reason to prefer a pointer to a reference here is simply that a pointer can be marked restrict and a reference cannot. That's just the way C++ is.

You can try it both ways and measure the results to see which is really faster. And if you're curious, I've written in depth on restrict and the load-hit-store elsewhere.

Addendum: after writing the above I realize that the people at Moz were more worried about the reference itself being aliased; that is, that something else might point at the same address where const char *p is stored, rather than the char to which p points. But my answer is the same: under the hood, const char *&p means const char **p, and that's subject to the same aliasing issues as any other pointer.

How does the compiler know that p_in_out isn't aliased somehow? It really can't optimize away writing the data back through the reference.

struct foo {
    void setX(int); void setY(int);
    const char* current_pos;
} x;
parse_foo(x.current_pos, x);
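A small self-contained sketch of why the compiler has to assume the worst in a call like that. The helper store_is_visible is hypothetical, and foo is cut down to the one member that matters here.

```cpp
#include <cassert>

// Reduced version of the struct above: only the aliasing member is kept.
struct foo {
    const char* current_pos = nullptr;
};

// Returns true if a store through `out` changed the pointer that
// `p_in_out` refers to, i.e. if the two arguments alias.
bool store_is_visible(const char*& p_in_out, foo& out) {
    assert(p_in_out != nullptr);
    const char* before = p_in_out;
    out.current_pos = before + 1;  // store through `out`
    return p_in_out != before;     // p_in_out has to be re-read here
}
```

Called as store_is_visible(x.current_pos, x) this returns true; called with an unrelated pointer and a separate foo it returns false. Since both calls are legal, the compiler cannot cache p_in_out in a register across a store through out, whereas a local copy of the pointer is immune to that store.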

I look at this and ask why you didn't just return the pointer. Then you don't have a reference to a pointer and you don't have to worry about modifying the original.

const char* parse_foo(const char* p, foo& out) {
    //use p;
    return p;
}

It also means you can call the function with an rvalue:

p = parse_foo(p+2, out); 
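Filled in as a complete sketch, assuming for illustration that the thing being parsed is a run of decimal digits and that foo has a value member (neither is stated in the question):

```cpp
#include <cassert>

// Hypothetical output type; the real foo is not shown in the question.
struct foo {
    int value = 0;
};

// Parses decimal digits starting at p and returns a pointer just past
// the last digit consumed. The caller's own pointer is never touched.
const char* parse_foo(const char* p, foo& out) {
    assert(p != nullptr);
    int value = 0;
    while (*p >= '0' && *p <= '9') {
        value = value * 10 + (*p - '0');
        ++p;
    }
    out.value = value;
    return p;
}
```

Here p is an ordinary by-value parameter, so it is already a local the compiler can keep in a register, and the caller decides whether to store the returned position.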

One thought that comes immediately to mind: exception safety. If you throw an exception during parsing, using a temporary variable is what gives you strong exception safety: either the function call succeeds completely or it doesn't do anything (from the user's perspective).
