How to parse a file in INI/JSON-like non-standard format?
Suppose I have a text file in the following (non-standard) format:
xxx { a = v1; b = v2 } yyy { a = v3; c = v4 }
I cannot change it to any standard format (INI, XML, YAML, etc.).
Now I would like to find the value of property a in section xxx (that is, v1). What is the simplest way to do it in Java/Groovy?
With Groovy, you could leverage the ConfigSlurper. However, you would first need to hack together a map of valid values, so that it doesn't choke trying to work out what v1, v2, v3, etc. are.
This seems to work:
def input = '''xxx { a = v1; b = v2 }
|yyy { a = v3; c = v4 }'''.stripMargin()
def slurper = new ConfigSlurper()
// Find all words 'w' and make a map of [ w1:'w1', w2:'w2', ... ]
slurper.binding = ( ( input =~ /\w+/ ) as List ).collectEntries { w -> [ (w):w ] }
def result = slurper.parse( input )
println result
That prints out:
[xxx:[a:v1, b:v2], yyy:[a:v3, c:v4]]
The value asked for is then available as result.xxx.a. (Tested with Groovy 1.8.4.)
Firstly, you've given an example, not specified a format. Before you go any further, you need to get hold of a complete specification for the format. Or, if there isn't one, you need to look at the code that generates it and reverse engineer a specification.
(If you try to implement a parser based on a small example, there's a good chance that it will encounter real-life input that doesn't fit the patterns you have intuited.)
Having done that, you can look for an off-the-shelf parser that can cope with your format. If you are lucky, it might be close enough to INI, JSON, YAML or something else for the corresponding parser to (mostly) work.
But the chances are that it won't, and that you will need to write your own parser. There are various ways you could do this, for instance:
In reality, the correct choice(s) depend on how simple or complex the actual format is. We can't tell that from a single example.
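If the real format turns out to be as simple as the one-line example, a hand-written parser can be quite small. The following is only a minimal Java sketch under that assumption: sections are never nested, every property has the form key = value, properties are separated by ';', and neither names nor values contain '{', '}' or ';'. The class name SectionParser and its parse method are hypothetical names chosen for illustration, not part of any library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; assumes flat, non-nested sections.
class SectionParser {

    /** Parses "name { key = value; ... }" blocks into section -> (key -> value). */
    static Map<String, Map<String, String>> parse(String text) {
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        int pos = 0;
        while (true) {
            int open = text.indexOf('{', pos);
            if (open < 0) {
                // No further sections: only whitespace may remain.
                if (!text.substring(pos).trim().isEmpty()) {
                    throw new IllegalArgumentException("Unexpected text at offset " + pos);
                }
                return sections;
            }
            int close = text.indexOf('}', open);
            if (close < 0) {
                throw new IllegalArgumentException("Unclosed '{' at offset " + open);
            }
            String name = text.substring(pos, open).trim();
            if (name.isEmpty()) {
                throw new IllegalArgumentException("Missing section name before offset " + open);
            }
            Map<String, String> props = new LinkedHashMap<>();
            for (String entry : text.substring(open + 1, close).split(";")) {
                if (entry.trim().isEmpty()) {
                    continue; // tolerate empty sections and a trailing ';'
                }
                int eq = entry.indexOf('=');
                if (eq < 0) {
                    throw new IllegalArgumentException("Missing '=' in: " + entry.trim());
                }
                props.put(entry.substring(0, eq).trim(), entry.substring(eq + 1).trim());
            }
            sections.put(name, props);
            pos = close + 1;
        }
    }

    public static void main(String[] args) {
        String input = "xxx { a = v1; b = v2 } yyy { a = v3; c = v4 }";
        System.out.println(parse(input).get("xxx").get("a")); // prints: v1
    }
}
```

The sketch fails loudly on input it does not understand rather than guessing, which makes it easier to notice when real files deviate from the assumed format. Quoting, comments, or nesting would call for a proper tokenizer or a parser generator instead.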
For a true INI-format file, see the question "What is the easiest way to parse an INI file in Java?"
What you're showing here looks more like JSON than INI format to me. Perhaps look at JSON parsing libraries.
The truth is that you're not using an established format, so you probably won't be able to use a parser written for one. Your best bet is probably to refactor the file you're dealing with (if possible) into a well-known format to begin with. Don't try to reinvent the wheel unless you absolutely have to.
There's unlikely to be an out-of-the-box solution if you're dealing with a non-standard format. Here are a few approaches you might want to look into:
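One lightweight possibility, not necessarily among those this answer had in mind, is a regular expression that pulls out a single property without parsing the whole file. The Java sketch below assumes flat sections of the form name { key = value; ... } with no nesting, and values that contain neither ';' nor '}'. PropertyLookup and find are hypothetical names used only for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; assumes flat, non-nested sections.
class PropertyLookup {

    /** Returns the value of key inside the named section, if present. */
    static Optional<String> find(String text, String section, String key) {
        Pattern pattern = Pattern.compile(
                "(?<!\\w)" + Pattern.quote(section)   // section name, not part of a longer word
                + "\\s*\\{"                           // opening brace of the section
                + "[^}]*?"                            // stay inside this section's braces
                + "(?<!\\w)" + Pattern.quote(key)     // property name, not part of a longer word
                + "\\s*=\\s*"
                + "([^;}]*)");                        // value runs up to ';' or '}'
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? Optional.of(matcher.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        String input = "xxx { a = v1; b = v2 } yyy { a = v3; c = v4 }";
        System.out.println(find(input, "xxx", "a").orElse("(not found)")); // prints: v1
    }
}
```

This is fine for a quick lookup, but it inherits the usual limits of regular expressions: as soon as the format allows nested braces, quoted strings, or comments, a real parser is the safer choice.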