Using normal_distribution in a loop
I'm wondering if there could be a problem with putting normal_distribution in a loop.
Here is the code that uses normal_distribution in this strange way:
std::default_random_engine generator;
//std::normal_distribution<double> distribution(5.0,2.0);
for (int i=0; i<nrolls; ++i) {
    std::normal_distribution<double> distribution(5.0,2.0);
    float x = distribution(generator);
}
Putting the normal_distribution object outside the loop may be slightly more efficient than putting it in the loop. When it's inside the loop, the normal_distribution object may be re-constructed every time, whereas if it's outside the loop it's only constructed once.
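As a rough illustration of the hoisted form, here is a minimal sketch that constructs the distribution once and reuses it for every draw. The function name `sample_mean` and the choice to return the average are assumptions made for the example; they are not part of the question's code.

```cpp
#include <random>

// Hypothetical helper (not from the question): draws `nrolls` samples from a
// normal distribution with mean 5.0 and standard deviation 2.0, then returns
// their average.
double sample_mean(int nrolls) {
    std::default_random_engine generator;                      // default-seeded, as in the question
    std::normal_distribution<double> distribution(5.0, 2.0);   // constructed once, outside the loop

    double sum = 0.0;
    for (int i = 0; i < nrolls; ++i) {
        sum += distribution(generator);  // every iteration reuses the same object
    }
    return nrolls > 0 ? sum / nrolls : 0.0;
}
```

With a large `nrolls`, the returned average should land close to the configured mean of 5.0.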
Based on an analysis of the assembly, declaring distribution outside the loop is more efficient.
Let's look at two different functions, along with the corresponding assembly. One of them declares distribution inside the loop, and the other one declares it outside the loop. To simplify the analysis, the distribution is declared const in both cases, so we (and the compiler) know that it doesn't get modified.
You can see the complete assembly here.
#include <random>

// This function is here to prevent the compiler from optimizing out the
// loop entirely. It is only declared, never defined, so the compiler cannot
// see what it does with the distribution.
void foo(std::normal_distribution<double> const& d) noexcept;

void inside_loop(double mean, double sd, int n) {
    for(int i = 0; i < n; i++) {
        // Constructed anew on every iteration
        const std::normal_distribution<double> d(mean, sd);
        foo(d);
    }
}

void outside_loop(double mean, double sd, int n) {
    // Constructed once, before the loop starts
    const std::normal_distribution<double> d(mean, sd);
    for(int i = 0; i < n; i++) {
        foo(d);
    }
}
inside_loop assembly
The assembly for the loop looks like this (compiled with gcc 8.3 at -O3).
.L3:
movapd xmm2, XMMWORD PTR [rsp]
lea rdi, [rsp+16]
add ebx, 1
mov BYTE PTR [rsp+40], 0
movaps XMMWORD PTR [rsp+16], xmm2
call foo(std::normal_distribution<double> const&)
cmp ebp, ebx
jne .L3
Basically, it
- constructs the distribution
- invokes foo with the distribution
- tests to see if it should exit the loop
outside_loop assembly
Using the same compilation options, outside_loop just calls foo repeatedly without re-constructing the distribution. There are fewer instructions, and the loop body no longer writes to the stack: it only passes the address of the already-constructed distribution in rdi.
.L12:
mov rdi, rsp
add ebx, 1
call foo(std::normal_distribution<double> const&)
cmp ebp, ebx
jne .L12
Yes. There are definitely good times to declare variables inside a loop. If you were modifying distribution somehow inside the loop, then it would make sense to reset it every time just by constructing it again.
Furthermore, if you never use a variable outside of a loop, it makes sense to declare it inside the loop just for the purposes of readability.
Types that fit inside a CPU's registers (so floats, ints, doubles, and small user-defined types) oftentimes have no overhead associated with their construction, and declaring them inside a loop can actually lead to better assembly by simplifying the compiler's analysis of register allocation.
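For the case mentioned above where the distribution really does change on every iteration, constructing it inside the loop is the straightforward choice. The sketch below assumes a hypothetical function `draw_with_changing_mean` that takes one mean per iteration; the name and signature are invented for illustration.

```cpp
#include <random>
#include <vector>

// Hypothetical helper (not from the answer): draws one sample per entry in
// `means`, using a different mean each time and a shared standard deviation.
std::vector<double> draw_with_changing_mean(const std::vector<double>& means, double sd) {
    std::default_random_engine generator;
    std::vector<double> samples;
    samples.reserve(means.size());

    for (double mean : means) {
        // The parameters differ on each iteration, so a fresh object is
        // constructed here; any state from the previous iteration is discarded.
        std::normal_distribution<double> d(mean, sd);
        samples.push_back(d(generator));
    }
    return samples;
}
```

std::normal_distribution also provides an operator() overload that takes a param_type argument, which is another way to vary the parameters per call while keeping a single distribution object outside the loop.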
Looking at the interface of the normal distribution, there is a member called reset, which:
resets the internal state of the distribution
This implies that the distribution may have an internal state. If it does, then recreating the object at each iteration discards that state every time. Not using the distribution as intended may produce samples that are not normally distributed, or it might just be inefficient.
What state could it be? That is certainly implementation defined. Looking at one implementation from LLVM, the normal distribution is defined around here. More specifically, the operator() is here. Looking at the code, there is certainly some state shared between subsequent calls. More specifically, at each subsequent call, the state of the boolean variable _V_hot_ is flipped. If it is true, significantly fewer computations are performed and the stored value of _V_ is used. If it is false, then _V_ is computed from scratch.
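To make the idea of a cached value concrete, here is a minimal sketch of a generator that produces normal samples in pairs and saves the second one for the next call. It is not the LLVM source: the class name `CachedNormal`, its members, and the use of the Marsaglia polar method are all assumptions chosen for illustration.

```cpp
#include <cmath>
#include <random>

// Hypothetical illustration of a distribution with a one-value cache.
// This is NOT the libc++ implementation, only a sketch of the same idea.
class CachedNormal {
public:
    CachedNormal(double mean, double stddev) : mean_(mean), stddev_(stddev) {}

    template <class Engine>
    double operator()(Engine& eng) {
        double z;
        if (has_saved_) {
            // Cheap path: reuse the value computed on the previous call.
            has_saved_ = false;
            z = saved_;
        } else {
            // Expensive path: the polar method yields two independent
            // standard normal values from one accepted point (u, v).
            double u, v, s;
            do {
                u = uni_(eng);
                v = uni_(eng);
                s = u * u + v * v;
            } while (s >= 1.0 || s == 0.0);
            const double f = std::sqrt(-2.0 * std::log(s) / s);
            saved_ = v * f;      // keep the second value for the next call
            has_saved_ = true;
            z = u * f;
        }
        return z * stddev_ + mean_;
    }

    // Drops the cached value, mirroring what reset() is described as doing.
    void reset() { has_saved_ = false; }

    bool has_saved() const { return has_saved_; }

private:
    double mean_;
    double stddev_;
    double saved_ = 0.0;
    bool has_saved_ = false;
    std::uniform_real_distribution<double> uni_{-1.0, 1.0};
};
```

In this sketch every second call returns without touching the engine at all. Constructing a new `CachedNormal` on each iteration would throw the saved value away and force the expensive path every time.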
I did not look very deeply into why they chose to do this. But, looking only at the computations performed, it should be much faster to rely on the internal state. While this is only one implementation, it shows that the standard allows the use of internal state, and that in some cases it is beneficial.
Later edit:
The GCC libstdc++ implementation of std::normal_distribution can be found here. Note that the operator() calls another function, __generate_impl, which is defined in a separate file here. While different, this implementation has the same kind of flag, here named _M_saved_available, that speeds up every other call.
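One way to observe this effect without reading library source is to count how often the distribution asks the engine for random bits. The sketch below assumes a hypothetical wrapper, `CountingEngine`, around std::mt19937, plus two invented helper functions. The exact counts depend on the standard library in use, so none are stated here.

```cpp
#include <random>

// Hypothetical engine wrapper that counts how many times it is invoked.
struct CountingEngine {
    using result_type = std::mt19937::result_type;

    std::mt19937 base;              // default-seeded underlying engine
    unsigned long long calls = 0;   // number of operator() invocations so far

    static constexpr result_type min() { return std::mt19937::min(); }
    static constexpr result_type max() { return std::mt19937::max(); }

    result_type operator()() {
        ++calls;
        return base();
    }
};

// Draws n samples with the distribution constructed once, outside the loop.
unsigned long long engine_calls_hoisted(int n) {
    CountingEngine eng;
    std::normal_distribution<double> d(5.0, 2.0);
    for (int i = 0; i < n; ++i) {
        (void)d(eng);
    }
    return eng.calls;
}

// Draws n samples with the distribution re-constructed on every iteration.
unsigned long long engine_calls_rebuilt(int n) {
    CountingEngine eng;
    for (int i = 0; i < n; ++i) {
        std::normal_distribution<double> d(5.0, 2.0);
        (void)d(eng);
    }
    return eng.calls;
}
```

On an implementation that caches a second value, `engine_calls_hoisted` would be expected to report noticeably fewer engine calls than `engine_calls_rebuilt` for the same `n`. On an implementation without such a cache, the two counts may be the same.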