
python parsing: what file format uses `=>` OR how to read custom input files to dict

When using the zmdp solver I came across a funky file format that I haven't seen before: it uses => for assignment. I wasn't able to find out what format it is from the package documentation (it says it is a "policy" format, but it must be based on something more generic).

{
  policyType => "MaxPlanesLowerBound",
  numPlanes => 7,
  planes => [
    {
      action => 2,
      numEntries => 3,
      entries => [
        0, 18.7429,
        1, 18.7426,
        2, 21.743
      ]
    },
    ### more entries ###
    {
      action => 3,
      numEntries => 3,
      entries => [
        0, 20.8262,
        1, 20.8261,
        2, 20.8259
      ]
    }
  ]
}

I researched a lot on what would be a straightforward way to parse such files (in Python), and also read a blog post that covers a huge variety of options for lexing and parsing (the tools that looked most promising for my example were parsimonious and parsy).
However, every solution I can think of feels like re-inventing the wheel, and lexing and parsing seem like overkill for what I'm trying to do.
I also found a Stack Overflow question which coincidentally also concerns a format that uses =>. However, being lazy and minimalistic when it comes to code, I don't like the regex solution too much. My gut feeling tells me that there must be a 3-4 line solution to read the input file into a Python dict or a similarly useful format. In particular, I suspect that this is already the standard syntax of some format I am just not aware of (it's obviously not csv, json, yaml or xml).

The question therefore is: Is the above a standard file format, and if yes, what is it?
If not, how do I parse this file elegantly and compactly in Python 3, i.e. without writing a regex for every keyword?

Aside from replacing '=>' with ':' and adding a top-level key, I don't see any differences from JSON here.

import json
# str.replace returns a new string, so the result must be assigned
filestr = filestr.replace('=>', ':')
dictionary = json.loads(filestr)

Edited after seeing the comment above.

Unquoted keys are indeed not part of the JSON standard. To address that, you can use a more lenient parsing library, or you can add the quotes with a regex.
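A minimal sketch of the regex route, in case it helps. It quotes every bare key that sits in front of `=>` in a single substitution (so there is no regex per keyword) and then hands the result to `json.loads`. The function name `parse_policy` and the shortened sample are made up for illustration, and treating lines that start with `#` as comments is an assumption about the format, not something taken from the zmdp documentation.

```python
import json
import re


def parse_policy(text):
    """Turn `key => value` text into a Python dict via json.loads."""
    # Assumption: lines starting with '#' are comments and can be dropped.
    lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith("#")]
    cleaned = "\n".join(lines)
    # Quote each bare identifier directly followed by '=>' and replace
    # the '=>' with ':' so the text becomes valid JSON.
    as_json = re.sub(r"([A-Za-z_]\w*)\s*=>", r'"\1":', cleaned)
    return json.loads(as_json)


# Hypothetical, shortened sample in the same style as the question.
sample = """
{
  policyType => "MaxPlanesLowerBound",
  numPlanes => 2,
  planes => [
    {
      action => 2,
      numEntries => 3,
      entries => [
        0, 18.7429,
        1, 18.7426,
        2, 21.743
      ]
    },
    # a comment line
    {
      action => 3,
      numEntries => 3,
      entries => [
        0, 20.8262,
        1, 20.8261,
        2, 20.8259
      ]
    }
  ]
}
"""

policy = parse_policy(sample)
```

Note that this would also rewrite a `word =>` sequence inside a quoted string value, and it does not handle trailing commas, so it is only suitable for files that look like the sample above.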
