
Generate XML from custom source code

I have source code written in a new, small programming language:

 method M(n: int) returns (r: int)
  ensures r == n;
{
  var i := 0;
  while (i < n)
  {
    i := i + 1;
  }
  r := i;
}

I want to read this source file (a single file with no dependencies) using Java and create XML for the function name, input parameters, return types, keywords such as ensures, and so on.

To do that, I need to analyse the source code and perhaps build some kind of tree structure to get a hierarchical view (at least, that is how I am thinking about it).

Is there a framework that would let me customize the keywords so that I can analyse this kind of input and generate XML from it, or should I just read the file line by line and try to write my own parser?

My main purpose here is to represent this code in XML format in order to generate UML-style diagrams. I am not aiming to create a new compiler or language. (My question was not clear enough; I hope this makes it clearer.)

Your description is a bit vague, but it sounds like you're looking for a library for parsing a custom language and converting it into another language. You might start with ANTLR. Also, if you are building Java objects from your input, you might consider JAXB for marshalling them to XML.

You can use the parser generator ANTLR for that. The process is to define the language as a grammar consisting of rules; ANTLR uses an EBNF form for that. When the parser matches a rule, you can specify an action in Java to run, in your case writing some XML tags to a stream.

Before you can think about generating an XML file, the first step is definitely to parse the input document. Regexes are not a good candidate for that job, and hand-made parsers are difficult to design, especially for languages that support some form of operator precedence.
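To give a feel for what handling operator precedence involves in a hand-written parser, here is a minimal sketch of precedence climbing in Java. The operator table and the class name PrecedenceDemo are assumptions made for illustration; the question does not state the real precedence rules of the language.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PrecedenceDemo {
    // Higher number binds tighter. This table is assumed, not taken from the real language.
    private static final Map<String, Integer> PRECEDENCE =
        Map.of("==", 1, "<", 2, "+", 3, "-", 3, "*", 4);

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private PrecedenceDemo(String source) {
        // Tokens: words/numbers, "==", or any other single non-space character.
        Matcher m = Pattern.compile("\\w+|==|\\S").matcher(source);
        while (m.find()) tokens.add(m.group());
    }

    /** Parses an expression and returns it fully parenthesized, so the grouping is visible. */
    static String group(String source) {
        PrecedenceDemo p = new PrecedenceDemo(source);
        String result = p.parse(1);
        if (p.pos != p.tokens.size()) {
            throw new IllegalStateException("unexpected token: " + p.tokens.get(p.pos));
        }
        return result;
    }

    // Precedence climbing: only consume operators at least as strong as minPrecedence.
    private String parse(int minPrecedence) {
        if (pos >= tokens.size()) throw new IllegalStateException("operand expected");
        String left = tokens.get(pos++); // operand: identifier or number (not validated here)
        while (pos < tokens.size()) {
            Integer precedence = PRECEDENCE.get(tokens.get(pos));
            if (precedence == null || precedence < minPrecedence) break;
            String operator = tokens.get(pos++);
            // Left-associative: the right-hand side may only contain stronger operators.
            String right = parse(precedence + 1);
            left = "(" + left + " " + operator + " " + right + ")";
        }
        return left;
    }

    public static void main(String[] args) {
        System.out.println(group("i + 1 < n")); // prints ((i + 1) < n)
    }
}
```

Even this small version needs a tokenizer, a precedence table, and a recursive loop, and it still ignores parentheses, unary operators, and error recovery, which is where parser generators start to pay off.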

Here are three good libraries for developing parsers for whatever language you may design. They are not all equivalent, though, so your choice among them should be guided by the kind of language you are designing.

With any of these, you describe your language's structure and keywords, then the code to run when each element is found. You then add code to create a parse tree (or you may let the engine generate one for you). Finally, you can write code that works on that parse tree, possibly a visitor that outputs it as XML.
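As a rough illustration of those steps (describe the structure, build a parse tree, then walk it to emit XML), here is a minimal hand-written sketch in plain Java. It is not what ANTLR or XText would generate; the class MiniParser, its Node type, and the XML element names are all invented for this example, and it only handles the method header line of the sample language. A simple recursive walk stands in for a full visitor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MiniParser {
    /** A parse-tree node: a label, optional text, and child nodes. */
    static final class Node {
        final String label;
        final String text;
        final List<Node> children = new ArrayList<>();
        Node(String label, String text) { this.label = label; this.text = text; }
        Node add(Node child) { children.add(child); return this; }
    }

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    MiniParser(String source) {
        // Tokens: identifiers/keywords, numbers, ":=" and "==", or any single symbol.
        Matcher m = Pattern.compile("[A-Za-z_]\\w*|\\d+|:=|==|\\S").matcher(source);
        while (m.find()) tokens.add(m.group());
    }

    private String next() {
        if (pos >= tokens.size()) throw new IllegalStateException("unexpected end of input");
        return tokens.get(pos++);
    }

    private boolean peek(String token) {
        return pos < tokens.size() && tokens.get(pos).equals(token);
    }

    private void expect(String token) {
        String actual = next();
        if (!actual.equals(token)) {
            throw new IllegalStateException("expected '" + token + "' but found '" + actual + "'");
        }
    }

    // header := "method" NAME decls "returns" decls
    Node parseHeader() {
        expect("method");
        Node method = new Node("method", next());
        method.add(parseDecls("parameters"));
        expect("returns");
        method.add(parseDecls("returns"));
        return method;
    }

    // decls := "(" [ decl { "," decl } ] ")"    decl := NAME ":" TYPE
    private Node parseDecls(String label) {
        Node list = new Node(label, null);
        expect("(");
        while (!peek(")")) {
            if (!list.children.isEmpty()) expect(",");
            String name = next();
            expect(":");
            list.add(new Node("decl", name).add(new Node("type", next())));
        }
        expect(")");
        return list;
    }

    /** Walks the tree depth-first and renders it as indented XML (names are not escaped). */
    static String toXml(Node node, String indent) {
        StringBuilder sb = new StringBuilder(indent).append('<').append(node.label);
        if (node.text != null) sb.append(" name=\"").append(node.text).append('"');
        if (node.children.isEmpty()) return sb.append("/>\n").toString();
        sb.append(">\n");
        for (Node child : node.children) sb.append(toXml(child, indent + "  "));
        return sb.append(indent).append("</").append(node.label).append(">\n").toString();
    }

    public static void main(String[] args) {
        Node tree = new MiniParser("method M(n: int) returns (r: int)").parseHeader();
        System.out.print(toXml(tree, ""));
    }
}
```

Extending this by hand to the ensures clause, the while loop, and the assignments is possible, but each new construct adds another grammar rule and another parsing method, which is the work a parser generator automates.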

By the way, if the exact structure of your language is still undefined, you can use any of the "parser generator" tools above. In that case, if you are an Eclipse user, I suggest trying XText first, as it will generate an Eclipse editor with autocompletion support, refactoring support, and so on, all for free.

Update: XText can also be used to generate a graphical editor for your language, provided that makes sense. Have a look here for an example: http://vimeo.com/12824804

This is not a trivial subject (if you want to do it right). You will need to go through most of the stages of writing a compiler (minus the part that actually writes out machine code).

See this thread for lots of information to get started: "Learning to write a compiler".

Writing a compiler is a really rewarding experience, but it is a lot of work.

Once you have a parse tree, you'll be able to export it to XML. But that part comes a lot later.

Assuming that ONLY the header line of each method is important, here is a completely different strategy.

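import java.nio.file.*;
import java.util.*;
import java.util.regex.*;

class MethodHeaders {
    // Matches a header such as: method M(n: int) returns (r: int)
    static final Pattern HEADER = Pattern.compile(
        "^\\s*method\\s+([a-zA-Z][a-zA-Z0-9_]*)\\s*\\(([^)]*)\\)\\s*returns\\s*\\(([^)]*)\\)");

    record MethodDefinition(String name, String arguments, String returnType) {}

    public static void main(String[] args) throws Exception {
        List<MethodDefinition> methods = new ArrayList<>();
        for (String line : Files.readAllLines(Path.of(args[0]))) {
            Matcher m = HEADER.matcher(line);
            if (m.find()) {
                // So the line is a method header. Extract name, arguments, return type.
                methods.add(new MethodDefinition(m.group(1), m.group(2), m.group(3)));
            }
        }
        for (MethodDefinition method : methods) {
            // Generate XML for that method...
            System.out.println(method);
        }
    }
}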
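The "generate XML" step could look something like the following sketch, which uses the JDK's built-in StAX writer (javax.xml.stream). The class name MethodXml and the element and attribute names (method, parameter, return, name, type) are assumptions chosen for illustration, as is the idea that each declaration has the form "name: type"; the three arguments correspond to the three groups captured by the header regex.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class MethodXml {
    /** Renders one method header as XML from the three captured groups. */
    static String toXml(String methodName, String arguments, String returnType) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("method");
            xml.writeAttribute("name", methodName);
            writeDeclarations(xml, "parameter", arguments);
            writeDeclarations(xml, "return", returnType);
            xml.writeEndElement();
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    // Splits "n: int, m: int" into one empty element per "name: type" pair.
    private static void writeDeclarations(XMLStreamWriter xml, String tag, String list)
            throws XMLStreamException {
        for (String declaration : list.split(",")) {
            String[] parts = declaration.split(":");
            if (parts.length != 2) continue; // skip empty or malformed entries
            xml.writeEmptyElement(tag);
            xml.writeAttribute("name", parts[0].trim());
            xml.writeAttribute("type", parts[1].trim());
        }
    }

    public static void main(String[] args) {
        // Values as they would be captured from: method M(n: int) returns (r: int)
        System.out.println(toXml("M", "n: int", "r: int"));
    }
}
```

Using a stream writer rather than string concatenation means attribute values are escaped for you, which matters as soon as expressions such as i < n end up in the output.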

Is this approach better suited to your expectations and needs?
