Efficiency of assign-and-compare in the same statement in Smalltalk

A previous SO question raised the issue of which idiom is better in terms of execution-time efficiency:

[ (var := exp) > 0 ] whileTrue: [ ... ]

versus

[ var := exp. 
  var > 0 ] whileTrue: [ ... ]

Intuitively, it seems the first form could be more efficient during execution, because it saves fetching the one additional statement that the second form has. Is this true in most Smalltalks?

Trying with two stupid benchmarks:

| var acc |
var := 10000.
[ [ (var := var / 2) < 0  ] whileTrue: [ acc := acc + 1 ] ] bench.

| var acc |
var := 10000.
[ [ var := var / 2. var < 0  ] whileTrue: [ acc := acc + 1 ] ] bench

This reveals no major differences between the two versions.

Any other opinions?

So the question is: What should I use to achieve a better execution time?
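Note that in the two benchmarks above the condition `var / 2 < 0` is never true, so the loop body never runs and `acc` is never touched. A slightly tighter variant is sketched below. It assumes a Pharo-style image where `bench` and brace arrays are available; the counter value 100000 is an arbitrary choice, not something from the original question.

```smalltalk
"Sketch only: each timed block resets its own state, so the condition
 is evaluated about 100000 times per run. The start value is arbitrary."
| var acc |
{
    "Form 1: assignment and comparison in a single expression."
    [ var := 100000. acc := 0.
      [ (var := var - 1) > 0 ] whileTrue: [ acc := acc + 1 ] ] bench.

    "Form 2: assignment and comparison as two statements."
    [ var := 100000. acc := 0.
      [ var := var - 1. var > 0 ] whileTrue: [ acc := acc + 1 ] ] bench
}
```

Evaluating this returns the two `bench` results as a pair. On a VM with a JIT, any difference is likely to sit within measurement noise.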

temp := <expression>.
temp > 0

or

(temp := <expression>) > 0

In cases like this one, the best way to arrive at a conclusion is to go down one step in the level of abstraction. In other words, we need a better understanding of what's happening behind the scenes.

The executable part of a CompiledMethod is represented by its bytecodes. When we save a method, what we are doing is compiling it into a series of low-level instructions that the VM executes every time the method is invoked. So, let's take a look at the bytecodes of each of the cases above.

Since <expression> is the same in both cases, let's reduce it drastically to eliminate noise. Also, let's put our code in a method so that we have a CompiledMethod to play with:

Object >> m
  | temp |
  temp := 1.
  temp > 0

Now, let's look in CompiledMethod and its superclasses for some message that would show us the bytecodes of Object >> #m. The selector should contain the substring bytecodes, right?

...

Here it is: #symbolicBytecodes! Now let's evaluate (Object >> #m) symbolicBytecodes to get:

pushConstant: 1
popIntoTemp: 0
pushTemp: 0
pushConstant: 0
send: >
pop
returnSelf

Note, by the way, how our temp variable has been renamed to Temp: 0 in the bytecode language.

Now repeat with the other form and get:

pushConstant: 1
storeIntoTemp: 0
pushConstant: 0
send: >
pop
returnSelf
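Both listings can be produced in one go with a snippet along these lines. It is a sketch that assumes a Pharo-style image; the selectors `mSplit` and `mFused` are hypothetical names chosen here, and the exact instruction names printed may vary with the dialect and bytecode set.

```smalltalk
"Compile both forms as throwaway methods on Object, capture their
 symbolic bytecodes, then remove the methods again.
 mSplit and mFused are made-up selectors for illustration."
| split fused |
Object compile: 'mSplit
    | temp |
    temp := 1.
    temp > 0'.
Object compile: 'mFused
    | temp |
    (temp := 1) > 0'.
split := (Object >> #mSplit) symbolicBytecodes.
fused := (Object >> #mFused) symbolicBytecodes.
Object removeSelector: #mSplit; removeSelector: #mFused.
{ split. fused }
```

Inspecting the resulting pair puts the two instruction sequences side by side.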

The difference is

popIntoTemp: 0
pushTemp: 0

versus

storeIntoTemp: 0

What this reveals is that the two forms store temp from the stack in different ways. In the first case, the result of our <expression> is popped from the execution stack into temp, and then temp is pushed again to restore the stack: a pop followed by a push of the same thing. In the second case, no pop or push happens; storeIntoTemp: copies the value on top of the stack into temp and leaves it on the stack.

So the conclusion is that the first case generates two instructions that cancel each other out: a pop followed by a push.

This also explains why the difference is so hard to measure: push and pop instructions have direct translations into machine code and the CPU will execute them really fast.

Note, however, that nothing prevents the compiler from automatically optimizing the code by recognizing that pop + push is in fact equivalent to storeInto. With such an optimization, both Smalltalk snippets would result in exactly the same machine code.

Now you should be able to decide which form you prefer. In my opinion, such a decision should only take into account the programming style that you like better. Execution time is irrelevant here because the difference is minimal and could easily be reduced to zero by implementing the optimization we just discussed. By the way, that would be an excellent exercise for those willing to understand the low-level realms of the unparalleled Smalltalk language.
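As a rough illustration of that idea, the sketch below runs a toy peephole pass over the symbolic instructions written as plain strings. It is not a real compiler API: the string representation and the variable names are assumptions made for this example, and a real pass would also have to check that no jump targets the pushTemp instruction.

```smalltalk
"Toy peephole pass: collapse 'popIntoTemp: n' followed by 'pushTemp: n'
 into 'storeIntoTemp: n'. Works on strings only, for illustration."
| input output i prefix |
input := #('pushConstant: 1' 'popIntoTemp: 0' 'pushTemp: 0'
           'pushConstant: 0' 'send: >' 'pop' 'returnSelf').
output := OrderedCollection new.
prefix := 'popIntoTemp: '.
i := 1.
[ i <= input size ] whileTrue: [
    | current next |
    current := input at: i.
    next := i < input size ifTrue: [ input at: i + 1 ] ifFalse: [ nil ].
    ((current beginsWith: prefix)
        and: [ next notNil
        and: [ next = ('pushTemp: ', (current allButFirst: prefix size)) ] ])
        ifTrue: [
            "Cancelling pair found: emit one store and skip both."
            output add: 'storeIntoTemp: ', (current allButFirst: prefix size).
            i := i + 2 ]
        ifFalse: [
            output add: current.
            i := i + 1 ] ].
output asArray
```

Applied to the first listing, the pass yields the same six instructions as the second listing.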
