Efficiency of assign-and-compare in the same statement in Smalltalk
A previous SO question raised the issue of which idiom is better in terms of execution time:
[ (var := exp) > 0 ] whileTrue: [ ... ]
versus
[ var := exp.
var > 0 ] whileTrue: [ ... ]
Intuitively, it seems the first form could be more efficient during execution, because it saves fetching the one additional statement that the second form has. Is this true in most Smalltalks?
Trying with two stupid benchmarks:
| var acc |
var := 10000.
[ [ (var := var / 2) < 0 ] whileTrue: [ acc := acc + 1 ] ] bench.
| var acc |
var := 10000.
[ [ var := var / 2. var < 0 ] whileTrue: [ acc := acc + 1 ] ] bench
This reveals no major difference between the two versions.
Any other opinions?
So the question is: What should I use to achieve a better execution time?
temp := <expression>.
temp > 0
or
(temp := <expression>) > 0
In cases like this one, the best way to arrive at a conclusion is to go down one step in the level of abstraction. In other words, we need a better understanding of what's happening behind the scenes.
The executable part of a CompiledMethod is represented by its bytecodes. When we save a method, what we are doing is compiling it into a series of low-level instructions that the VM executes every time the method is invoked. So, let's take a look at the bytecodes of each of the cases above.
Since <expression> is the same in both cases, let's reduce it drastically to eliminate noise. Also, let's put our code in a method so as to have a CompiledMethod to play with:
Object >> m
| temp |
temp := 1.
temp > 0
Now, let's look in CompiledMethod and its superclasses for some message that would show us the bytecodes of Object >> #m. The selector should contain the subword bytecodes, right?
...
Here it is: #symbolicBytecodes! Now let's evaluate (Object >> #m) symbolicBytecodes to get:
pushConstant: 1
popIntoTemp: 0
pushTemp: 0
pushConstant: 0
send: >
pop
returnSelf
Note, by the way, how our temp variable has been renamed to Temp: 0 in the bytecode language.
Now repeat with the other form and get:
pushConstant: 1
storeIntoTemp: 0
pushConstant: 0
send: >
pop
returnSelf
The difference is
popIntoTemp: 0
pushTemp: 0
versus
storeIntoTemp: 0
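To reproduce the comparison in one go, here is a minimal workspace sketch. It assumes a Pharo image, where compile: and removeSelector: are understood by classes and symbolicBytecodes by compiled methods. The selectors m1 and m2 are hypothetical throwaway names chosen for this illustration, and the exact listing may differ between Smalltalk dialects and compiler versions.

```smalltalk
"Compile both variants as throwaway methods on Object.
 m1 and m2 are hypothetical selectors used only for this comparison."
Object compile: 'm1
    | temp |
    temp := 1.
    temp > 0'.
Object compile: 'm2
    | temp |
    (temp := 1) > 0'.

"Print the symbolic bytecodes of each variant to the Transcript."
#(m1 m2) do: [ :selector |
    Transcript show: selector; cr.
    (Object >> selector) symbolicBytecodes do: [ :each |
        Transcript show: '    ', each printString; cr ] ].

"Remove the throwaway methods again."
#(m1 m2) do: [ :selector | Object removeSelector: selector ].
```

Compiling into Object is only a convenience here; a scratch class would keep the experiment out of the system classes.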
What this reveals is that the two cases move the value between the execution stack and temp in different ways. In the first case, the result of our <expression> is popped from the execution stack into temp, and then temp is pushed again to restore the stack: a pop followed by a push of the same thing. In the second case, no pop or push happens: the value on top of the stack is simply copied into temp and stays on the stack for the comparison.
So the conclusion is that in the first case we generate two instructions that cancel each other out: a pop followed by a push.
This also explains why the difference is so hard to measure: push and pop instructions have direct translations into machine code, and the CPU executes them really fast.
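For anyone who wants to try measuring the difference anyway, here is a sketch of a benchmark whose loop body actually runs. It assumes Pharo's bench message, as used in the question. Initializing acc, resetting var on every run, comparing with > 0 and using integer division // (so that the value eventually reaches zero) are choices made for this sketch, not part of the original benchmarks.

```smalltalk
| var acc |
acc := 0.

"Variant 1: assign and compare in the same statement."
[ var := 10000.
  [ (var := var // 2) > 0 ] whileTrue: [ acc := acc + 1 ] ] bench.

"Variant 2: assign first, then compare in a separate statement."
[ var := 10000.
  [ var := var // 2.
    var > 0 ] whileTrue: [ acc := acc + 1 ] ] bench.
```

Evaluate each bench expression separately with "print it" and compare the reported rates. Run-to-run noise may well be larger than any difference between the two variants.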
Note, however, that nothing prevents the compiler from automatically optimizing the code by recognizing that pop + push is in fact equivalent to storeInto. With such an optimization, both Smalltalk snippets would result in exactly the same machine code.
Now you should be able to decide which form you prefer. In my opinion, such a decision should only take into account the programming style that you like better. Execution time is irrelevant here because the difference is minimal, and it could easily be reduced to zero by implementing the optimization we just discussed. By the way, that would be an excellent exercise for those willing to understand the low-level realms of the unparalleled Smalltalk language.