[英]How is pattern matching in Scala implemented at the bytecode level?
How is pattern matching in Scala implemented at the bytecode level? Scala 中的模式匹配是如何在字节码级别实现的?
Is it like a series of if (x instanceof Foo)
constructs, or something else?它像一系列
if (x instanceof Foo)
构造,还是其他什么? What are its performance implications?它的性能影响是什么?
For example, given the following code (from Scala By Example pages 46-48), how would the equivalent Java code for the eval
method look like?例如,给定以下代码(来自Scala By Example第 46-48 页),
eval
方法的等效 Java 代码是什么样的?
abstract class Expr
case class Number(n: Int) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr
def eval(e: Expr): Int = e match {
case Number(x) => x
case Sum(l, r) => eval(l) + eval(r)
}
PS I can read Java bytecode, so a bytecode representation would be good enough for me, but probably it would be better for the other readers to know how it would look like as Java code. PS 我可以阅读 Java 字节码,所以字节码表示对我来说已经足够了,但对于其他读者来说,了解它作为 Java 代码的样子可能会更好。
PPS Does the book Programming in Scala give an answer to this and similar questions about how Scala is implemented? PPS Scala 编程这本书是否回答了这个问题以及关于 Scala 是如何实现的类似问题? I have ordered the book, but it has not yet arrived.
我已经订购了这本书,但还没有到。
The low level can be explored with a disassembler but the short answer is that it's a bunch of if/elses where the predicate depends on the pattern可以使用反汇编器探索低级别,但简短的回答是它是一堆 if/else,其中谓词取决于模式
case Sum(l,r) // instance of check followed by fetching the two arguments and assigning to two variables l and r but see below about custom extractors
case "hello" // equality check
case _ : Foo // instance of check
case x => // assignment to a fresh variable
case _ => // do nothing, this is the tail else on the if/else
There's much more that you can do with patterns like or patterns and combinations like "case Foo(45, x)", but generally those are just logical extensions of what I just described.你可以用像“case Foo(45, x)”这样的模式或模式和组合做更多的事情,但通常这些只是我刚刚描述的逻辑扩展。 Patterns can also have guards, which are additional constraints on the predicates.
模式也可以有守卫,这是对谓词的额外约束。 There are also cases where the compiler can optimize pattern matching, eg when there's some overlap between cases it might coalesce things a bit.
也有编译器可以优化模式匹配的情况,例如,当情况之间有一些重叠时,它可能会合并一些东西。 Advanced patterns and optimization are an active area of work in the compiler, so don't be surprised if the byte code improves substantially over these basic rules in current and future versions of Scala.
高级模式和优化是编译器中的一个活跃工作领域,因此如果字节码在当前和未来版本的 Scala 中比这些基本规则有显着改进,请不要感到惊讶。
In addition to all that, you can write your own custom extractors in addition to or instead of the default ones Scala uses for case classes.除此之外,您可以编写自己的自定义提取器,以补充或代替 Scala 用于案例类的默认提取器。 If you do, then the cost of the pattern match is the cost of whatever the extractor does.
如果这样做,那么模式匹配的成本就是提取器所做的任何事情的成本。 A good overview is found in http://lamp.epfl.ch/~emir/written/MatchingObjectsWithPatterns-TR.pdf
在http://lamp.epfl.ch/~emir/written/MatchingObjectsWithPatterns-TR.pdf 中有一个很好的概述
James (above) said it best.詹姆斯(上)说得最好。 However, if you're curious it's always a good exercise to look at the disassembled bytecode.
但是,如果您很好奇,查看反汇编的字节码总是一个很好的练习。 You can also invoke
scalac
with the -print
option, which will print your program with all Scala-specific features removed.您还可以使用
-print
选项调用scalac
,该选项将在删除所有 Scala 特定功能的情况下打印您的程序。 It's basically Java in Scala's clothing.它基本上是披着 Scala 外衣的 Java。 Here's the relevant
scalac -print
output for the code snippet you gave:这是您提供的代码片段的相关
scalac -print
输出:
def eval(e: Expr): Int = {
<synthetic> val temp10: Expr = e;
if (temp10.$isInstanceOf[Number]())
temp10.$asInstanceOf[Number]().n()
else
if (temp10.$isInstanceOf[Sum]())
{
<synthetic> val temp13: Sum = temp10.$asInstanceOf[Sum]();
Main.this.eval(temp13.e1()).+(Main.this.eval(temp13.e2()))
}
else
throw new MatchError(temp10)
};
Since version 2.8, Scala has had the @switch annotation.从 2.8 版开始,Scala 有了@switch注释。 The goal is to ensure, that pattern matching will be compiled into tableswitch or lookupswitch instead of series of conditional
if
statements.目标是确保模式匹配将被编译到tableswitch 或 lookupswitch 中,而不是一系列条件
if
语句。
To expand on @Zifre's comment: if you are reading this in the future and the scala compiler has adopted new compilation strategies and you want to know what they are, here's how you find out what it does.扩展@Zifre 的评论:如果您将来正在阅读本文,并且 Scala 编译器采用了新的编译策略,并且您想知道它们是什么,那么您可以通过以下方式了解它的作用。
Copy-paste your match
code into a self-contained example file.将您的
match
代码复制粘贴到一个独立的示例文件中。 Run scalac
on that file.在该文件上运行
scalac
。 Then run javap -v -c theClassName$.class
.然后运行
javap -v -c theClassName$.class
。
For example, I put the following into /tmp/question.scala
:例如,我将以下内容放入
/tmp/question.scala
:
object question {
abstract class Expr
case class Number(n: Int) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr
def eval(e: Expr): Int = e match {
case Number(x) => x
case Sum(l, r) => eval(l) + eval(r)
}
}
Then I ran scalac question.scala
, which produced a bunch of *.class
files.然后我运行
scalac question.scala
,它产生了一堆*.class
文件。 Poking around a bit, I found the match statement inside question$.class
.稍微翻了一下,我在
question$.class
找到了 match 语句。 The javap -c -v question$.class
output is available below.下面提供了
javap -c -v question$.class
输出。
Since we're looking for a condition control flow construct, knowing about the java bytecode instruction set suggests that looking for "if" should be a good place to start.由于我们正在寻找条件控制流构造,因此了解 java 字节码指令集表明寻找“if”应该是一个很好的起点。
In two locations we find a pair of consecutive lines on the form isinstanceof <something>; ifeq <somewhere>
在两个位置,我们在表单
isinstanceof <something>; ifeq <somewhere>
上找到了一对连续的行isinstanceof <something>; ifeq <somewhere>
isinstanceof <something>; ifeq <somewhere>
, which means: if the most recently computed value is not an instance of something
then goto somewhere
. isinstanceof <something>; ifeq <somewhere>
,这意味着:如果最近计算的值不是something
的实例,则转到somewhere
。 ( ifeq
is jump if zero
, and isinstanceof
gives you a zero to represent false.) (
ifeq
是jump if zero
, isinstanceof
给你一个零来表示假。)
If you follow the control flow around, you'll see that it agrees with the answer given by @Jorge Ortiz: we do if (blah isinstanceof something) { ... } else if (blah isinstanceof somethingelse) { ... }
.如果您遵循控制流程,您会发现它与@Jorge Ortiz 给出的答案一致:我们这样做
if (blah isinstanceof something) { ... } else if (blah isinstanceof somethingelse) { ... }
。
Here is the javap -c -v question$.class
output:这是
javap -c -v question$.class
输出:
Classfile /tmp/question$.class
Last modified Nov 20, 2020; size 956 bytes
MD5 checksum cfc788d4c847dad0863a797d980ad2f3
Compiled from "question.scala"
public final class question$
minor version: 0
major version: 50
flags: (0x0031) ACC_PUBLIC, ACC_FINAL, ACC_SUPER
this_class: #2 // question$
super_class: #4 // java/lang/Object
interfaces: 0, fields: 1, methods: 3, attributes: 4
Constant pool:
#1 = Utf8 question$
#2 = Class #1 // question$
#3 = Utf8 java/lang/Object
#4 = Class #3 // java/lang/Object
#5 = Utf8 question.scala
#6 = Utf8 MODULE$
#7 = Utf8 Lquestion$;
#8 = Utf8 <clinit>
#9 = Utf8 ()V
#10 = Utf8 <init>
#11 = NameAndType #10:#9 // "<init>":()V
#12 = Methodref #2.#11 // question$."<init>":()V
#13 = Utf8 eval
#14 = Utf8 (Lquestion$Expr;)I
#15 = Utf8 question$Number
#16 = Class #15 // question$Number
#17 = Utf8 n
#18 = Utf8 ()I
#19 = NameAndType #17:#18 // n:()I
#20 = Methodref #16.#19 // question$Number.n:()I
#21 = Utf8 question$Sum
#22 = Class #21 // question$Sum
#23 = Utf8 e1
#24 = Utf8 ()Lquestion$Expr;
#25 = NameAndType #23:#24 // e1:()Lquestion$Expr;
#26 = Methodref #22.#25 // question$Sum.e1:()Lquestion$Expr;
#27 = Utf8 e2
#28 = NameAndType #27:#24 // e2:()Lquestion$Expr;
#29 = Methodref #22.#28 // question$Sum.e2:()Lquestion$Expr;
#30 = NameAndType #13:#14 // eval:(Lquestion$Expr;)I
#31 = Methodref #2.#30 // question$.eval:(Lquestion$Expr;)I
#32 = Utf8 scala/MatchError
#33 = Class #32 // scala/MatchError
#34 = Utf8 (Ljava/lang/Object;)V
#35 = NameAndType #10:#34 // "<init>":(Ljava/lang/Object;)V
#36 = Methodref #33.#35 // scala/MatchError."<init>":(Ljava/lang/Object;)V
#37 = Utf8 this
#38 = Utf8 e
#39 = Utf8 Lquestion$Expr;
#40 = Utf8 x
#41 = Utf8 I
#42 = Utf8 l
#43 = Utf8 r
#44 = Utf8 question$Expr
#45 = Class #44 // question$Expr
#46 = Methodref #4.#11 // java/lang/Object."<init>":()V
#47 = NameAndType #6:#7 // MODULE$:Lquestion$;
#48 = Fieldref #2.#47 // question$.MODULE$:Lquestion$;
#49 = Utf8 question
#50 = Class #49 // question
#51 = Utf8 Sum
#52 = Utf8 Expr
#53 = Utf8 Number
#54 = Utf8 Code
#55 = Utf8 LocalVariableTable
#56 = Utf8 LineNumberTable
#57 = Utf8 StackMapTable
#58 = Utf8 SourceFile
#59 = Utf8 InnerClasses
#60 = Utf8 ScalaInlineInfo
#61 = Utf8 Scala
{
public static final question$ MODULE$;
descriptor: Lquestion$;
flags: (0x0019) ACC_PUBLIC, ACC_STATIC, ACC_FINAL
public static {};
descriptor: ()V
flags: (0x0009) ACC_PUBLIC, ACC_STATIC
Code:
stack=1, locals=0, args_size=0
0: new #2 // class question$
3: invokespecial #12 // Method "<init>":()V
6: return
public int eval(question$Expr);
descriptor: (Lquestion$Expr;)I
flags: (0x0001) ACC_PUBLIC
Code:
stack=3, locals=9, args_size=2
0: aload_1
1: astore_2
2: aload_2
3: instanceof #16 // class question$Number
6: ifeq 27
9: aload_2
10: checkcast #16 // class question$Number
13: astore_3
14: aload_3
15: invokevirtual #20 // Method question$Number.n:()I
18: istore 4
20: iload 4
22: istore 5
24: goto 69
27: aload_2
28: instanceof #22 // class question$Sum
31: ifeq 72
34: aload_2
35: checkcast #22 // class question$Sum
38: astore 6
40: aload 6
42: invokevirtual #26 // Method question$Sum.e1:()Lquestion$Expr;
45: astore 7
47: aload 6
49: invokevirtual #29 // Method question$Sum.e2:()Lquestion$Expr;
52: astore 8
54: aload_0
55: aload 7
57: invokevirtual #31 // Method eval:(Lquestion$Expr;)I
60: aload_0
61: aload 8
63: invokevirtual #31 // Method eval:(Lquestion$Expr;)I
66: iadd
67: istore 5
69: iload 5
71: ireturn
72: new #33 // class scala/MatchError
75: dup
76: aload_2
77: invokespecial #36 // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
80: athrow
LocalVariableTable:
Start Length Slot Name Signature
0 81 0 this Lquestion$;
0 81 1 e Lquestion$Expr;
20 61 4 x I
47 34 7 l Lquestion$Expr;
54 27 8 r Lquestion$Expr;
LineNumberTable:
line 6: 0
line 7: 2
line 8: 27
line 6: 69
StackMapTable: number_of_entries = 3
frame_type = 252 /* append */
offset_delta = 27
locals = [ class question$Expr ]
frame_type = 254 /* append */
offset_delta = 41
locals = [ top, top, int ]
frame_type = 248 /* chop */
offset_delta = 2
}
SourceFile: "question.scala"
InnerClasses:
public static #51= #22 of #50; // Sum=class question$Sum of class question
public static abstract #52= #45 of #50; // Expr=class question$Expr of class question
public static #53= #16 of #50; // Number=class question$Number of class question
ScalaInlineInfo: length = 0xE (unknown attribute)
01 01 00 02 00 0A 00 09 01 00 0D 00 0E 01
Scala: length = 0x0 (unknown attribute)
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.