
pyparsing: example JSON parser fails for list of dicts

All,

I'm trying to understand how to handle a list of dicts using pyparsing. I went back to the example JSON parser for best practices, but found that it can't handle a list of dicts either!

Consider the following (this is the stock example JSON parser, but with some comments removed and my test case in place of the default one):

#!/usr/bin/env python2.7

from pyparsing import *

TRUE = Keyword("true").setParseAction( replaceWith(True) )
FALSE = Keyword("false").setParseAction( replaceWith(False) )
NULL = Keyword("null").setParseAction( replaceWith(None) )

jsonString = dblQuotedString.setParseAction( removeQuotes )
jsonNumber = Combine( Optional('-') + ( '0' | Word('123456789',nums) ) +
                    Optional( '.' + Word(nums) ) +
                    Optional( Word('eE',exact=1) + Word(nums+'+-',nums) ) )

jsonObject = Forward()
jsonValue = Forward()
jsonElements = delimitedList( jsonValue )
jsonArray = Group(Suppress('[') + Optional(jsonElements) + Suppress(']') )
jsonValue << ( jsonString | jsonNumber | Group(jsonObject)  | jsonArray | TRUE | FALSE | NULL )
memberDef = Group( jsonString + Suppress(':') + jsonValue )
jsonMembers = delimitedList( memberDef )
jsonObject << Dict( Suppress('{') + Optional(jsonMembers) + Suppress('}') )

jsonComment = cppStyleComment
jsonObject.ignore( jsonComment )

def convertNumbers(s,l,toks):
    n = toks[0]
    try:
        return int(n)
    except ValueError, ve:
        return float(n)

jsonNumber.setParseAction( convertNumbers )

if __name__ == "__main__":
    testdata = """
[ { "foo": "bar", "baz": "bar2" },
  { "foo": "bob", "baz": "fez" } ]
    """
    results = jsonValue.parseString(testdata)
    print "[0]:", results[0].dump()
    print "[1]:", results[1].dump()

This is valid JSON, but the pyparsing example fails when trying to index into the second expected array element:

[0]: [[['foo', 'bar'], ['baz', 'bar2']], [['foo', 'bob'], ['baz', 'fez']]]
[1]:
Traceback (most recent call last):
  File "json2.py", line 42, in <module>
    print "[1]:", results[1].dump()
  File "/Library/Python/2.7/site-packages/pyparsing.py", line 317, in __getitem__
    return self.__toklist[i]
IndexError: list index out of range

Can anyone help me identify what's wrong with this grammar?

EDIT: Fixed a bug where I was parsing the input as a JSON object rather than a JSON value.

Note: This is related to "pyparsing: grammar for list of Dictionaries (erlang)", where I'm basically trying to do the same with an Erlang data structure, and failing in a similar way :(

This may be valid JSON, but your grammar won't handle it. Here's why:

jsonObject << Dict( Suppress('{') + Optional(jsonMembers) + Suppress('}') )

This says the grammar object must be surrounded by {...}. You are bracketing it as an array [...]. Since the top-level object must be a dictionary, it will need key names. Changing your test data to:

{ "col1":{ "foo": "bar", "baz": "bar2" },
  "col2":{ "foo": "bob", "baz": "fez" } }

or

{ "data":[{ "foo": "bar", "baz": "bar2" },
          { "foo": "bob", "baz": "fez" }] }

will allow this grammar to parse it. Want the top-level object to be an array? Just modify the grammar!

The parse results object that you get back from this expression is a list of the matched tokens. pyparsing doesn't know whether you are going to match one or many tokens, so it always returns a list; in your case, a list containing 1 element, the array of dicts.
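To make the "list containing 1 element" point concrete, here is a minimal sketch. The toy grammar (a bracketed list of integers) and the names `integer` and `array` are made up for illustration and are not part of the JSON example; it only assumes pyparsing is installed.

```python
from pyparsing import Group, Suppress, Word, ZeroOrMore, nums

# Hypothetical toy grammar: a bracketed, comma-separated list of integers.
integer = Word(nums)
array = Group(
    Suppress("[") + integer + ZeroOrMore(Suppress(",") + integer) + Suppress("]")
)

results = array.parseString("[1, 2, 3]")

# The outer ParseResults holds exactly one token: the Group for the array.
print(len(results))          # 1
# The array's own elements live one level down, inside that token.
first = results[0]
print(first.asList())        # ['1', '2', '3']
```

Indexing `results[1]` here would raise the same IndexError as in the question, because the three integers are inside `results[0]`, not alongside it.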

Change

results = jsonValue.parseString(testdata)

to

results = jsonValue.parseString(testdata)[0]

and I think things will start to look better. After doing this, I get:

[0]: [['foo', 'bar'], ['baz', 'bar2']]
- baz: bar2
- foo: bar
[1]: [['foo', 'bob'], ['baz', 'fez']]
- baz: fez
- foo: bob
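Once the outer list is unwrapped with `[0]`, each element of the array also supports lookup by key, which is what the `dump()` output above reflects. The following is a cut-down sketch rather than the stock parser: it handles only strings, objects and arrays, and the names `string`, `member`, `obj`, `array`, `value` and `records` are chosen here for illustration.

```python
from pyparsing import (Dict, Forward, Group, Suppress, ZeroOrMore,
                       dblQuotedString, removeQuotes)

# copy() so the shared dblQuotedString expression is not modified in place.
string = dblQuotedString.copy().setParseAction(removeQuotes)

value = Forward()
member = Group(string + Suppress(":") + value)
# Dict turns each [key, value] group into a named result.
obj = Dict(Suppress("{") + member + ZeroOrMore(Suppress(",") + member) + Suppress("}"))
array = Group(Suppress("[") + value + ZeroOrMore(Suppress(",") + value) + Suppress("]"))
value <<= (string | Group(obj) | array)

testdata = """
[ { "foo": "bar", "baz": "bar2" },
  { "foo": "bob", "baz": "fez" } ]
"""

# [0] unwraps the single top-level token, i.e. the array itself.
records = value.parseString(testdata)[0]

print(len(records))          # 2
print(records[0]["foo"])     # bar
print(records[1]["baz"])     # fez
```

The same `[0]` step applies to the full grammar from the question; only the reduced set of value types is specific to this sketch.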
