
How to parse a file in INI/JSON-like non-standard format?

Suppose I have a text file in the following (non-standard) format:

xxx { a = v1; b = v2 }
yyy { a = v3; c = v4 }

I cannot change it to any standard (INI/XML/YAML, etc.) format.

Now I would like to find the value of property a in section xxx (that is, v1). What is the simplest way to do it in Java/Groovy?

With Groovy, you could leverage the ConfigSlurper.

However, you would first need to hack together a binding that maps every bare word to itself, so that it doesn't choke trying to work out what v1, v2, v3, etc. are.

This seems to work:

def input = '''xxx { a = v1; b = v2 }
              |yyy { a = v3; c = v4 }'''.stripMargin()

def slurper = new ConfigSlurper()

// Find all words 'w' and make a map of [ w1:'w1', w2:'w2', ... ]
slurper.binding = ( ( input =~ /\w+/ ) as List ).collectEntries { w -> [ (w):w ] }

def result = slurper.parse( input )
println result

That prints out:

[xxx:[a:v1, b:v2], yyy:[a:v3, c:v4]]

(Groovy 1.8.4)

Firstly, you've given an example, not specified a format. Before you go any further, you need to get hold of a complete specification for the format. Or, if there isn't one, you need to see the code that generates it and reverse engineer a specification.

(If you try to implement a parser based on a small example, there's a good chance that it will encounter real-life input that doesn't fit the patterns you have intuited.)

Having done that, you can look for an off-the-shelf parser that can cope with your format. If you are lucky, it might be close enough to INI, JSON, YAML or something else for the corresponding parser to (mostly) work.

But the chances are that it won't, and that you will need to write your own parser. There are various ways you could do this, for instance:

  • You could split the file into lines and "parse" each line with a regex.
  • You could parse the file using a Scanner with appropriate delimiters.
  • You could use a parser generator to implement a lexer and parser.
  • You could implement a simple lexer and parser by hand.
  • There are probably Groovy-specific solutions.
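As a rough illustration of the first option, here is a minimal Java sketch. It assumes (and this is only an assumption drawn from the two example lines, not a specification) that every section sits on one line, that names and values contain no braces or semicolons, and that properties are separated by `;`. The class name `LineRegexParser` is made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineRegexParser {
    // Assumed line shape: name { key = value; key = value }
    private static final Pattern SECTION =
            Pattern.compile("^\\s*(\\w+)\\s*\\{(.*)\\}\\s*$");
    // Assumed property shape: key = value (value may not contain ';')
    private static final Pattern PROPERTY =
            Pattern.compile("\\s*(\\w+)\\s*=\\s*([^;]*?)\\s*");

    public static Map<String, Map<String, String>> parse(String text) {
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            Matcher section = SECTION.matcher(line);
            if (!section.matches()) {
                throw new IllegalArgumentException("Unrecognized line: " + line);
            }
            Map<String, String> props = new LinkedHashMap<>();
            for (String part : section.group(2).split(";")) {
                if (part.trim().isEmpty()) {
                    continue; // tolerate empty bodies and trailing ';'
                }
                Matcher prop = PROPERTY.matcher(part);
                if (!prop.matches()) {
                    throw new IllegalArgumentException("Unrecognized property: " + part);
                }
                props.put(prop.group(1), prop.group(2));
            }
            sections.put(section.group(1), props);
        }
        return sections;
    }

    public static void main(String[] args) {
        String input = "xxx { a = v1; b = v2 }\nyyy { a = v3; c = v4 }";
        System.out.println(parse(input).get("xxx").get("a")); // prints v1
    }
}
```

Anything the two patterns don't recognize is rejected with an exception rather than silently skipped, which at least makes it obvious when real input deviates from the example.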

In reality, the correct choice depends on how simple or complex the actual format is. We can't tell that from a single example.

For a true INI-format file, see: What is the easiest way to parse an INI file in Java?

What you're showing here looks more like JSON than INI format to me. Perhaps look at JSON parsing libraries. The truth is that you're not using an established format, so you probably won't be able to use an established format's parser. Your best bet is probably to refactor the file you're dealing with (if possible) into a well-known format to begin with. Don't try to reinvent the wheel unless you absolutely have to.

There's likely not going to be an out-of-the-box solution if you're dealing with a non-standard format. Here are a few approaches you might want to look into:

  • if the format is simple, write a custom recursive descent parser
  • write a filter to transform your format into INI, JSON, etc. and use existing libraries
  • create a Groovy DSL that matches your format and execute your file as a Groovy script
  • use a parser generator tool like ANTLR or parboiled to create a parser from a language specification
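To give an idea of what the first option might look like, here is a small hand-written recursive descent parser in Java. The grammar is guessed from the example in the question and is purely an assumption: a file is a sequence of sections, a section is a name followed by `{ key = value; ... }`, and names and values are runs of letters, digits and underscores. `SectionParser` is a made-up name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SectionParser {
    private final String src;
    private int pos = 0;

    private SectionParser(String src) {
        this.src = src;
    }

    public static Map<String, Map<String, String>> parse(String src) {
        return new SectionParser(src).file();
    }

    // file := (word body)*
    private Map<String, Map<String, String>> file() {
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        skipWhitespace();
        while (pos < src.length()) {
            String name = word();
            sections.put(name, body());
            skipWhitespace();
        }
        return sections;
    }

    // body := '{' (word '=' word (';' word '=' word)* ';'?)? '}'
    private Map<String, String> body() {
        Map<String, String> props = new LinkedHashMap<>();
        expect('{');
        while (!peek('}')) {
            String key = word();
            expect('=');
            props.put(key, word());
            if (!peek('}')) {
                expect(';');
            }
        }
        expect('}');
        return props;
    }

    // word := one or more letters, digits or underscores
    private String word() {
        skipWhitespace();
        int start = pos;
        while (pos < src.length() && isWordChar(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw error("expected a name or value");
        }
        return src.substring(start, pos);
    }

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    private boolean peek(char c) {
        skipWhitespace();
        return pos < src.length() && src.charAt(pos) == c;
    }

    private void expect(char c) {
        if (!peek(c)) {
            throw error("expected '" + c + "'");
        }
        pos++;
    }

    private void skipWhitespace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }

    private IllegalArgumentException error(String message) {
        return new IllegalArgumentException(message + " at offset " + pos);
    }

    public static void main(String[] args) {
        String input = "xxx { a = v1; b = v2 }\nyyy { a = v3; c = v4 }";
        System.out.println(parse(input).get("xxx").get("a")); // prints v1
    }
}
```

Because whitespace (including newlines) is skipped between tokens, this version would also accept sections that span several lines. Quoted values, comments or nested sections would each need their own grammar rule, if the real format has them.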
