Non-greedy parsing with pyparsing
I'm trying to parse a line with pyparsing. The line is composed of a number of (key, values) pairs. What I'd like to get is a list of (key, values). A simple example:
ids = 12 fields = name
should result in something like:
[('ids', '12'), ('fields', 'name')]
A more complex example:
ids = 12, 13, 14 fields = name, title
should result in something like:
[('ids', '12, 13, 14'), ('fields', 'name, title')]
PS: the tuple inside the resulting list is just an example. It could be a dict, another list, or whatever; it's not that important.
But whatever I've tried so far, I get results like:
[('ids', '12 fields')]
Pyparsing is eating the next key, treating it as part of the value.
Here is some sample code:
import pyparsing as P
key = P.oneOf("ids fields")
equal = P.Literal('=')
key_equal = key + equal
val = ~key_equal + P.Word(P.alphanums+', ')
gr = P.Group(key_equal+val)
print(gr.parseString("ids = 12 fields = name"))
Can someone help me? Thanks.
The first problem lies in this line:
val = ~key_equal + P.Word(P.alphanums+', ')
At first glance this looks as though it matches an alphanumeric sequence followed by the literal ', '. It does not: P.Word takes a set of allowed characters, so it matches any run of alphanumeric characters, ',' and ' '. Because the space is in that set, the match runs straight on into the next key.
What you'd want instead is:
val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
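The difference between the two forms can be checked in isolation. The following is a minimal sketch, not part of the original answer; the sample string `text` and the variable names `greedy` and `bounded` are made up for illustration.

```python
import pyparsing as P

text = "12, 13, 14 fields = name"

# Word treats its argument as a set of allowed characters, so the space
# in that set lets the match run on into the next key.
greedy = P.Word(P.alphanums + ', ').parseString(text).asList()

# delimitedList only continues after an exact ", " delimiter, so it stops
# before " fields". combine=True joins the pieces back into one string.
bounded = P.delimitedList(
    P.Word(P.alphanums), ", ", combine=True
).parseString(text).asList()

print(greedy)
print(bounded)
```

Here `bounded` should hold only '12, 13, 14', while the single token in `greedy` also swallows 'fields'.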
The second problem is that you only parse one key-value pair:
gr = P.Group(key_equal+val)
Instead, you should parse as many as possible, grouping each pair separately:
gr = P.OneOrMore(P.Group(key_equal+val))
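Where Group sits relative to OneOrMore decides the shape of the result. Here is a small sketch comparing the two placements; the names `pair`, `flat` and `nested` are illustrative only, and the lookahead is left out because the value expression no longer accepts spaces.

```python
import pyparsing as P

key = P.oneOf("ids fields")
equal = P.Literal('=')
val = P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
pair = key + equal + val

text = "ids = 12, 13, 14 fields = name, title"

# One Group around the whole repetition: every token lands in a single list.
flat = P.Group(P.OneOrMore(pair)).parseString(text).asList()

# One Group per pair: each key/value pair gets its own sub-list.
nested = P.OneOrMore(P.Group(pair)).parseString(text).asList()

print(flat)
print(nested)
```

The per-pair grouping in `nested` is the shape that can be turned into a list of (key, values) entries.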
So the correct solution is:
>>> import pyparsing as P
>>> key = P.oneOf("ids fields")
>>> equal = P.Literal('=')
>>> key_equal = key + equal
>>> val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
>>> gr = P.OneOrMore(P.Group(key_equal+val))
>>> print(gr.parseString("ids = 12, 13, 14 fields = name, title"))
[['ids', '=', '12, 13, 14'], ['fields', '=', 'name, title']]
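The question asked for a list of (key, values) tuples, or a dict. One possible way to get there, sketched under the assumption that the '=' token is not needed in the output, is to swap P.Literal for P.Suppress and convert each group. The names `pairs` and `as_dict` are illustrative.

```python
import pyparsing as P

key = P.oneOf("ids fields")
equal = P.Suppress('=')  # match '=' but leave it out of the results
val = P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
gr = P.OneOrMore(P.Group(key + equal + val))

parsed = gr.parseString("ids = 12, 13, 14 fields = name, title")

# Each group is now [key, value]; turn it into a tuple.
pairs = [tuple(group) for group in parsed.asList()]
as_dict = dict(pairs)

print(pairs)
print(as_dict)
```

With '=' suppressed, each group contains just the key and the combined value, so `dict(pairs)` works directly.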