
Non-greedy parsing with pyparsing

I'm trying to parse a line with pyparsing. The line is made up of a number of (key, values) pairs, and what I'd like to get back is a list of (key, values). A simple example:

ids = 12 fields = name

should result in something like: [('ids', '12'), ('fields', 'name')]

A more complex example:

ids = 12, 13, 14 fields = name, title

should result in something like: [('ids', '12, 13, 14'), ('fields', 'name, title')]

PS: the tuple inside the resulting list is just an example. It could be a dict, another list or whatever; it's not that important.

But whatever I've tried so far, I get results like: [('ids', '12 fields')]

Pyparsing is eating the next key, treating it as part of the value.

Here is some sample code:

import pyparsing as P

key = P.oneOf("ids fields")
equal = P.Literal('=')
key_equal = key + equal
val = ~key_equal + P.Word(P.alphanums+', ')

gr = P.Group(key_equal+val)
print(gr.parseString("ids = 12 fields = name"))

Can someone help me? Thanks.

The first problem lies in this line:

val = ~key_equal + P.Word(P.alphanums+', ')

It reads as if the value should match an alphanumeric sequence followed by the literal ', ', but Word treats its argument as a set of allowed characters, so it actually matches any run of alphanumeric characters, ',' and ' '.
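A minimal sketch of that behaviour, reusing the names from the question (the variable names `loose_word` and `greedy_match` are made up for illustration). It uses the camelCase pyparsing names from the original post; recent pyparsing releases also offer snake_case equivalents.

```python
import pyparsing as P

key_equal = P.oneOf("ids fields") + P.Literal("=")

# Word takes a set of allowed characters, not a pattern: adding ', '
# simply makes ',' and ' ' legal characters inside the word.
loose_word = P.Word(P.alphanums + ", ")

# The ~key_equal lookahead is evaluated once, at the position where the
# value starts; it does not interrupt Word after matching has begun.
val = ~key_equal + loose_word

greedy_match = val.parseString("12 fields = name")[0]
print(repr(greedy_match))  # the text of the next key is part of the match
```

Because the space is an allowed character, the match only stops at '=', which is not in the set.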

What you'd want instead is:

val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)

The second problem is that you only parse one key-value pair:

gr = P.Group(key_equal+val)

Instead, you should parse as many as possible:

gr = P.OneOrMore(P.Group(key_equal+val))
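Where Group sits relative to OneOrMore determines the shape of the result. The following sketch compares the two placements on the simple input from the question; the names `one_group` and `per_pair` are illustrative only.

```python
import pyparsing as P

key_equal = P.oneOf("ids fields") + P.Literal("=")
val = P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
text = "ids = 12 fields = name"

# Group around the repetition: a single sublist holding every token.
one_group = P.Group(P.OneOrMore(key_equal + val)).parseString(text).asList()

# Group inside the repetition: one sublist per key/value pair.
per_pair = P.OneOrMore(P.Group(key_equal + val)).parseString(text).asList()

print(one_group)
print(per_pair)
```

The second form is the one that yields a separate entry for each key, which is what the question asks for.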

So the correct solution is:

>>> import pyparsing as P
>>> key = P.oneOf("ids fields")
>>> equal = P.Literal('=')
>>> key_equal = key + equal
>>> val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
>>> gr = P.OneOrMore(P.Group(key_equal+val))
>>> print(gr.parseString("ids = 12, 13, 14 fields = name, title"))
[['ids', '=', '12, 13, 14'], ['fields', '=', 'name, title']]
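If the goal is the list of tuples shown in the question, one possible variation is to suppress the '=' token and convert each group to a tuple. This is only a sketch: the helper `parse_pairs` and the names `pair` and `line` are hypothetical, not part of pyparsing or of the original answer.

```python
import pyparsing as P

key = P.oneOf("ids fields")
equal = P.Suppress("=")  # match '=' but keep it out of the results
key_equal = key + equal
val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
pair = P.Group(key_equal + val)
line = P.OneOrMore(pair)


def parse_pairs(text):
    """Return the key/value pairs found in text as a list of 2-tuples."""
    return [tuple(group) for group in line.parseString(text)]


pairs = parse_pairs("ids = 12, 13, 14 fields = name, title")
print(pairs)
```

Passing the result to `dict()` gives a dictionary instead, if that shape is more convenient.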
