
Scala macros and the JVM's method size limit

I'm replacing some code generation components in a Java program with Scala macros, and I'm running into the Java Virtual Machine's limit on the size of the generated bytecode for individual methods (64 kilobytes).

For example, suppose we have a large-ish XML file that represents a mapping from integers to integers that we want to use in our program. We want to avoid parsing this file at run time, so we'll write a macro that does the parsing at compile time and uses the contents of the file to create the body of our method:

import scala.language.experimental.macros
import scala.reflect.macros.Context

object BigMethod {
  // For this simplified example we'll just make some data up.
  val mapping = List.tabulate(7000)(i => (i, i + 1))

  def lookup(i: Int): Int = macro lookup_impl
  def lookup_impl(c: Context)(i: c.Expr[Int]): c.Expr[Int] = {
    import c.universe._

    val switch = reify(new scala.annotation.switch).tree
    val cases = mapping map {
      case (k, v) => CaseDef(c.literal(k).tree, EmptyTree, c.literal(v).tree)
    }

    c.Expr(Match(Annotated(switch, i.tree), cases))
  }
}

In this case the compiled method would be just over the size limit, but instead of a nice error saying so, we get a giant stack trace with a lot of calls to TreePrinter.printSeq and are told that we've slain the compiler.

I have a solution that involves splitting the cases into fixed-size groups, creating a separate method for each group, and adding a top-level match that dispatches the input value to the appropriate group's method. It works, but it's unpleasant, and I'd prefer not to have to use this approach every time I write a macro where the size of the generated code depends on some external resource.
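To make the grouping idea concrete, here is a hand-written sketch of the shape such generated code might take. It is not the macro itself and not necessarily what my solution emits: the object name GroupedLookup, the group size of 3, and the six entries are made up for illustration, and the dispatch by integer division assumes the keys are dense (0, 1, 2, ...).

```scala
// Hand-written illustration of the shape a grouping macro might generate.
// Assumptions: group size 3, six entries, dense keys starting at 0.
object GroupedLookup {
  // Each group lives in its own small method, so each method body
  // stays well under the JVM's per-method bytecode limit.
  private def group0(i: Int): Int = i match {
    case 0 => 1
    case 1 => 2
    case 2 => 3
  }

  private def group1(i: Int): Int = i match {
    case 3 => 4
    case 4 => 5
    case 5 => 6
  }

  // Top-level dispatch: work out the group from the key, then delegate.
  def lookup(i: Int): Int = (i / 3) match {
    case 0 => group0(i)
    case 1 => group1(i)
    case _ => throw new MatchError(i)
  }
}
```

The dispatcher still grows by one case per group, just far more slowly than a single flat match, so the group size has to be chosen with both the group methods and the dispatcher in mind.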

Is there a cleaner way to tackle this problem? More importantly, is there a way to deal with this kind of compiler error more gracefully? I don't like the idea of a library user getting an unintelligible "That entry seems to have slain the compiler" error message just because some XML file that's being processed by a macro has crossed some (fairly low) size threshold.

IMO, putting data into .class files isn't really a good idea. Class files have to be parsed as well; they're just binary. And storing the data in the JVM as code may have a negative impact on the performance of the garbage collector and the JIT compiler.

In your situation, I would pre-compile the XML into a binary file in a proper format and parse that. Eligible formats with existing tooling include, e.g., FastRPC or good old DBF. Or maybe pre-fill an ElasticSearch repository if you need quick advanced lookups and searches. Some implementations of the latter may also provide basic indexing, which could even let you skip the parsing: the app would just read from the respective offset.
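As a minimal sketch of the pre-compile-and-parse idea, the following uses plain java.io data streams rather than FastRPC or DBF, so the file layout (an entry count followed by key/value Int pairs) and the name BinaryMapping are assumptions made up for illustration, not an existing format or tool.

```scala
import java.io._

// Hypothetical helper: stores an Int -> Int mapping in a tiny ad hoc binary layout.
object BinaryMapping {
  // Build step: write the entry count, then each (key, value) pair as two Ints.
  def write(file: File, mapping: Seq[(Int, Int)]): Unit = {
    val out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(file)))
    try {
      out.writeInt(mapping.size)
      mapping.foreach { case (k, v) =>
        out.writeInt(k)
        out.writeInt(v)
      }
    } finally out.close()
  }

  // Run time: read the count, then rebuild the map pair by pair.
  def read(file: File): Map[Int, Int] = {
    val in = new DataInputStream(new BufferedInputStream(new FileInputStream(file)))
    try {
      val count = in.readInt()
      val builder = Map.newBuilder[Int, Int]
      var idx = 0
      while (idx < count) {
        val key = in.readInt()
        val value = in.readInt()
        builder += (key -> value)
        idx += 1
      }
      builder.result()
    } finally in.close()
  }
}
```

The XML would be converted once at build time with something like BinaryMapping.write, and the application would call BinaryMapping.read at startup, so no generated method ever has to hold the data.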

Since somebody has to say something, I followed the instructions at Importers to try to compile the tree before returning it.

If you give the compiler plenty of stack, it will correctly report the error.

(It didn't seem to know what to do with the switch annotation; that's left as a future exercise.)

apm@mara:~/tmp/bigmethod$ skalac bigmethod.scala ; skalac -J-Xss2m biguser.scala ; skala bigmethod.Test
Error is java.lang.RuntimeException: Method code too large!
Error is java.lang.RuntimeException: Method code too large!
biguser.scala:5: error: You ask too much of me.
  Console println s"5 => ${BigMethod.lookup(5)}"
                                           ^
one error found

as opposed to

apm@mara:~/tmp/bigmethod$ skalac -J-Xss1m biguser.scala 
Error is java.lang.StackOverflowError
Error is java.lang.StackOverflowError
biguser.scala:5: error: You ask too much of me.
  Console println s"5 => ${BigMethod.lookup(5)}"
                                           ^

where the client code is just this:

package bigmethod

object Test extends App {
  Console println s"5 => ${BigMethod.lookup(5)}"
}

My first time using this API, but not my last. Thanks for getting me kickstarted.

package bigmethod

import scala.language.experimental.macros
import scala.reflect.macros.Context

object BigMethod {
  // For this simplified example we'll just make some data up.
  //final val size = 700
  final val size = 7000
  val mapping = List.tabulate(size)(i => (i, i + 1))

  def lookup(i: Int): Int = macro lookup_impl
  def lookup_impl(c: Context)(i: c.Expr[Int]): c.Expr[Int] = {

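    // Trial-compile the tree with a runtime toolbox before handing it back to the compiler.
    def compilable[T](x: c.Expr[T]): Boolean = {
      import scala.reflect.runtime.{ universe => ru }
      import scala.tools.reflect._
      //val mirror = ru.runtimeMirror(c.libraryClassLoader)
      val mirror = ru.runtimeMirror(getClass.getClassLoader)
      val toolbox = mirror.mkToolBox()
      // Import the tree from the macro's compile-time universe into the runtime universe.
      val importer0 = ru.mkImporter(c.universe)
      type ruImporter = ru.Importer { val from: c.universe.type }
      val importer = importer0.asInstanceOf[ruImporter]
      val imported = importer.importTree(x.tree)
      // Strip symbols and types so the toolbox typechecks the tree from scratch.
      val tree = toolbox.resetAllAttrs(imported.duplicate)
      try {
        toolbox.compile(tree)
        true
      } catch {
        // Any failure counts as "not compilable", including
        // "Method code too large!" and StackOverflowError.
        case t: Throwable =>
          Console println s"Error is $t"
          false
      }
    }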
    import c.universe._

    val switch = reify(new scala.annotation.switch).tree
    val cases = mapping map {
      case (k, v) => CaseDef(c.literal(k).tree, EmptyTree, c.literal(v).tree)
    }

    //val res = c.Expr(Match(Annotated(switch, i.tree), cases))
    val res = c.Expr(Match(i.tree, cases))

    // before returning a potentially huge tree, try compiling it
    //import scala.tools.reflect._
    //val x = c.Expr[Int](c.resetAllAttrs(res.tree.duplicate))
    //val y = c.eval(x)
    if (!compilable(res)) c.abort(c.enclosingPosition, "You ask too much of me.")

    res
  }
}


 