ANTLR vs parboiled

What is the difference between ANTLR and parboiled for parsing in Java?

  • Which is easier to use for a beginner in parsing?
  • Which is more scalable (from simple to complex grammars)?
  • Which has better support for AST construction?
  • Which produces better error or warning messages for syntax errors?
  • Which has fewer problems to contend with (e.g. left recursion, shift/reduce conflicts, reduce/reduce conflicts)?
  • Comparisons with other open source tools are also appreciated.

Parboiled looks like a really cool tool. It might be easier for beginners, as it is just pure programming using a "parser combinator" idiom. I think this would become very verbose and harder to read, though the Java grammar I have seen doesn't look too bad. I cannot comment on its AST construction, but note that ANTLR 4 generates parse trees, not ASTs. Parboiled claims to have good error messages and recovery, but that is suspect because it is based on parsing expression grammars (PEGs), which in the worst case can only detect an error once the entire input has been seen. It also cannot identify ambiguities in your grammar (ambiguities, not conflicts). Neither tool announces parsing conflicts. ANTLR 4 handles direct left recursion for things like arithmetic expressions, but in general neither tool can handle left recursion. Like parboiled, ANTLR can be used purely as a library through its parser interpreter, but you must learn to use the tool if you want it to generate parsers. Currently, ANTLR 4 can generate parsers in Java, C#, JavaScript, Python 2, and Python 3.
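To make the left-recursion remark concrete, here is a minimal hand-written sketch in plain Java. It uses neither ANTLR nor parboiled, and the class name `LeftRecursionSketch` and its methods are made up for illustration. A rule such as `expr : expr '-' term | term` would recurse forever in a naive top-down parser, so the sketch parses the equivalent `term (('+' | '-') term)*` with a loop and folds the results to the left, which keeps subtraction left-associative.

```java
// Hand-written illustration; not generated by ANTLR and not using parboiled.
final class LeftRecursionSketch {
    private final String input;
    private int pos = 0;

    private LeftRecursionSketch(String input) {
        this.input = input.replace(" ", ""); // ignore spaces for simplicity
    }

    /** Evaluates an expression made of integers, '+', '-', '*' and '/'. */
    static int evaluate(String text) {
        LeftRecursionSketch parser = new LeftRecursionSketch(text);
        int value = parser.expr();
        if (parser.pos != parser.input.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return value;
    }

    // expr : term (('+' | '-') term)*
    // This loop replaces the left-recursive form  expr : expr '-' term | term.
    private int expr() {
        int left = term();
        while (pos < input.length() && (peek() == '+' || peek() == '-')) {
            char op = input.charAt(pos++);
            int right = term();
            left = (op == '+') ? left + right : left - right; // fold to the left
        }
        return left;
    }

    // term : number (('*' | '/') number)*
    private int term() {
        int left = number();
        while (pos < input.length() && (peek() == '*' || peek() == '/')) {
            char op = input.charAt(pos++);
            int right = number();
            left = (op == '*') ? left * right : left / right;
        }
        return left;
    }

    // number : [0-9]+
    private int number() {
        int start = pos;
        while (pos < input.length() && Character.isDigit(peek())) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at position " + pos);
        }
        return Integer.parseInt(input.substring(start, pos));
    }

    private char peek() {
        return input.charAt(pos);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("10 - 4 - 3")); // (10 - 4) - 3
        System.out.println(evaluate("2 + 3 * 4"));  // '*' binds tighter than '+'
    }
}
```

A right-recursive rewrite (`expr : term '-' expr | term`) would also terminate, but it would group `10 - 4 - 3` as `10 - (4 - 3)`, which is why the loop form is the usual choice for hand-written parsers.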

Today, Parboiled is mainly a Scala tool. So if you are using Scala, it may be the better solution in most cases.

Ease of use

ANTLR should be much easier for beginners. It's easier to start with.

  • There's a book about ANTLR. It's also well described in DSLs in Action. And it has better documentation in general.
  • There are ANTLR plugins for different IDEs. They let you see the AST and give you some other support.

Parboiled is a Scala library, so you get syntax highlighting and type checking out of the box. Parboiled1 works fine in most IDEs. Parboiled2 doesn't (this will be fixed soon in IDEA): the library uses macros, and most IDEs are not comfortable with them. That's why everything will show up in red.

But both are pretty easy to start with.

  • You can try ANTLR from the console (please correct me if I'm wrong).
  • You can install sbt, add parboiled as a dependency, and play with it in the Scala console.

Scalability

In my opinion Parboiled is more scalable, because you are writing Scala code. You can decompose your parser into multiple Scala traits and mix them with one another. You might create a DateTime parser and mix it into a LogEvent parser or a $PROTOCOL_NAME parser, and reuse them easily. With parboiled1 you can do some naughty things at runtime, which gives you power: in some cases you can construct parsers on the fly. For example, suppose you have a datetime format defined as a string. You can read the format string and generate the appropriate parser for it. This is possible even with Parboiled2 (which does lots of work at compile time). I don't know whether it's possible with ANTLR.
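The idea of building a parser from a format string at runtime, and then reusing it inside a larger parser, can be sketched without any library. The following is a hand-written illustration in plain Java, not parboiled's API: the `Parser` interface, `fromFormat`, and the other names are hypothetical, and the format handling is deliberately naive (every run of letters means "that many digits", everything else is a literal character).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hand-written illustration of building a parser at runtime; not parboiled's API.
final class RuntimeParserSketch {

    /** A parser consumes input starting at pos and returns the new position, or -1 on failure. */
    interface Parser {
        int parse(String input, int pos);
    }

    /** Matches exactly `count` decimal digits. */
    static Parser digits(int count) {
        return (input, pos) -> {
            if (pos + count > input.length()) {
                return -1;
            }
            for (int i = pos; i < pos + count; i++) {
                if (!Character.isDigit(input.charAt(i))) {
                    return -1;
                }
            }
            return pos + count;
        };
    }

    /** Matches one specific character. */
    static Parser literal(char expected) {
        return (input, pos) ->
                pos < input.length() && input.charAt(pos) == expected ? pos + 1 : -1;
    }

    /** Runs the given parsers one after another and fails as soon as one of them fails. */
    static Parser sequence(List<Parser> parts) {
        return (input, pos) -> {
            int current = pos;
            for (Parser part : parts) {
                current = part.parse(input, current);
                if (current < 0) {
                    return -1;
                }
            }
            return current;
        };
    }

    /** Builds a parser from a format such as "yyyy-MM-dd" (assumed, simplified format rules). */
    static Parser fromFormat(String format) {
        List<Parser> parts = new ArrayList<>();
        int i = 0;
        while (i < format.length()) {
            char c = format.charAt(i);
            if (Character.isLetter(c)) {
                int end = i;
                while (end < format.length() && format.charAt(end) == c) {
                    end++;
                }
                parts.add(digits(end - i)); // e.g. "yyyy" becomes "four digits"
                i = end;
            } else {
                parts.add(literal(c));      // separators must match exactly
                i++;
            }
        }
        return sequence(parts);
    }

    /** True if the parser accepts the whole input. */
    static boolean matches(Parser parser, String input) {
        return parser.parse(input, 0) == input.length();
    }

    public static void main(String[] args) {
        Parser date = fromFormat("yyyy-MM-dd");
        // Reuse the date parser inside a larger "log line" parser.
        Parser logLine = sequence(Arrays.asList(date, literal(' '), fromFormat("HH:mm")));
        System.out.println(matches(date, "2024-01-31"));
        System.out.println(matches(logLine, "2024-01-31 23:59"));
        System.out.println(matches(date, "31/01/2024"));
    }
}
```

Because each parser is an ordinary value, composing and reusing them is just a matter of passing objects around, which is the property the answer attributes to writing the grammar in the host language.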

AST

I like the Parboiled approach to the AST. It expects you to define an ADT, so in the ideal case you will have an immutable tree of case classes. You may add some DSL-like helpers to your tree nodes. For example, you may define a "\" method on your node that returns the child with the specified name.

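// The name and children fields are assumed here; the original only showed `value`.
case class Node(name: String, children: List[Node] = Nil) {
  // .... (other members elided in the original)

  // Returns the first direct child with the given name, if any.
  def \(childName: String): Option[Node] =
    children.find(child => child.name == childName)
}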

And then use it (each lookup returns an Option, so chained lookups go through flatMap):

(city \ "3rd street").flatMap(_ \ "23")

This makes working with the AST much easier. I hope it helps.
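For readers who work in Java rather than Scala, roughly the same navigation helper could look like the sketch below. It is only an illustration under assumed names (`AstNode`, `child`); it is not taken from parboiled or ANTLR, and it uses `Optional.flatMap` where the Scala version uses an operator.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Illustrative Java counterpart of the Scala Node above; all names are assumptions.
final class AstNode {
    private final String name;
    private final List<AstNode> children;

    AstNode(String name, AstNode... children) {
        this.name = name;
        // Copy the array so the tree cannot be changed after construction.
        this.children = Collections.unmodifiableList(Arrays.asList(children.clone()));
    }

    String name() {
        return name;
    }

    /** Returns the first direct child with the given name, if there is one. */
    Optional<AstNode> child(String childName) {
        return children.stream()
                .filter(c -> c.name.equals(childName))
                .findFirst();
    }

    public static void main(String[] args) {
        AstNode city = new AstNode("city",
                new AstNode("3rd street", new AstNode("23")));
        // Chained lookups: each step may be absent, so flatMap carries the Optional along.
        Optional<AstNode> house = city.child("3rd street").flatMap(s -> s.child("23"));
        System.out.println(house.map(AstNode::name).orElse("not found"));
    }
}
```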

Using in production

  • If you are using parboiled, you have to add it to your dependency list. That's all. Everything will work right out of the box.
  • If you are using ANTLR, you have to generate *.java files first, and regenerate them every time you change the grammar. In most cases the grammar does not change very often, but in my experience I have had situations where we changed the grammar every 2 days. You may not think that's a problem, though.

Speaking as a developer who recently used both frameworks as a newcomer to parsing frameworks, here is my comparison.

| # | ANTLR | Parboiled |
|---|-------|-----------|
| 1 | It has better documentation in general: it has its own website, there is a book (The Definitive ANTLR Reference by Terence Parr), and multiple examples are available in git. | It has limited documentation, which is only available in git. |
| 2 | There are ANTLR plugins for different IDEs that let you see the syntax diagram of rules and check the parse tree for given inputs. This helps a lot when writing the rules. | It does not have any IDE plugins. |
| 3 | It's a Java framework, written in Java. | It's a Scala library/framework and is a good fit if we are writing the parser in Scala. Parboiled2 doesn't support Java, so if we have to use it from Java we need the older Parboiled1. |
| 4 | In ANTLR we write the parsing rules (the grammar) separately in .g4 files. We need to generate the corresponding *.java files first, and regenerate them every time we change the grammar. | In Parboiled we write the parsing rules and grammar in the Java code itself. |
| 5 | In ANTLR we get the ParseTree (which is similar to an AST) by passing the input to the generated *.java ANTLR classes. | In Parboiled we have to use abstract data types and use the value stack to push and pop values while writing the grammar in order to get the AST. |
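The "value stack" idea from row 5 can be illustrated with a small hand-written parser. This is a sketch in plain Java and not parboiled's actual API; `ValueStackSketch` and `Ast` are made-up names. Each time a rule matches, it pushes a node onto a stack, and a rule that combines sub-results pops its operands and pushes the combined node, so that a single AST root is left at the end.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hand-written illustration of the value-stack idea; this is not parboiled's API.
final class ValueStackSketch {

    /** Minimal AST: a node is either a number leaf or an addition of two subtrees. */
    static final class Ast {
        final String text; // the digits for a leaf, "+" for an addition
        final Ast left;
        final Ast right;

        Ast(String text, Ast left, Ast right) {
            this.text = text;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return left == null ? text : "(" + left + " " + text + " " + right + ")";
        }
    }

    private final String input;
    private final Deque<Ast> values = new ArrayDeque<>();
    private int pos = 0;

    private ValueStackSketch(String input) {
        this.input = input;
    }

    /** Parses sums such as "1+2+3" and returns the root of the resulting AST. */
    static Ast parse(String input) {
        ValueStackSketch p = new ValueStackSketch(input);
        p.sum();
        if (p.pos != input.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return p.values.pop(); // exactly one value remains: the root
    }

    // sum : number ('+' number)*
    private void sum() {
        number();
        while (pos < input.length() && input.charAt(pos) == '+') {
            pos++;
            number();
            Ast right = values.pop(); // operands come off the stack in reverse order
            Ast left = values.pop();
            values.push(new Ast("+", left, right));
        }
    }

    // number : [0-9]+   (pushes a leaf node)
    private void number() {
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at position " + pos);
        }
        values.push(new Ast(input.substring(start, pos), null, null));
    }

    public static void main(String[] args) {
        System.out.println(parse("1+2+3")); // prints ((1 + 2) + 3)
    }
}
```

With ANTLR the generated parser builds the parse tree for you, whereas in this style the grammar author decides what gets pushed and popped, which is more work but yields exactly the tree shape you want.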

So, after using the two, I find ANTLR a bit easier to use and learn.
