How to parse a string with Pyparsing which involves ignoring exceptions and moving on to the next delimiter?
I have a program that requires a list of effects, each followed by a start time and an end time. The string comes from user input, so it can be faulty. I'm trying to parse the relevant information and ignore faulty entries, moving on to the next effect after each ";".
However, I'm not quite sure how to do this with the Pyparsing library, and I'm wondering whether it can be done purely with the library. The comments in the code show what each call should return, and the output below is what it actually returns.
import pyparsing as pp
testcase = "bounce, 5, 10; shutter, 12, 14" # returns [[bounce, 5, 10], [shutter, 12, 14]]
testcase2= "bounce, 5, 10; shutter, 12, 14; low_effort, 2, 23" # returns [[bounce, 5, 10], [shutter, 12, 14], [low_effort, 2, 23]]
testcase3= "_lolw, a, 2; effect, 6;" # returns [[effect, 6, None]]
testcase4= "bounce, 1, 10; effect, 5, a; bounce, 2, 10" # returns [[bounce, 1, 10], [bounce, 2, 10]]
testcase5= ";;;effect, 10; bounce, a, 1; bounce, 3, 10" # returns [[effect, 10, None], [bounce, 3, 10]]
testcase6= "effect, b, a; 9, 10, 11; max9, 10, 11; here, 2, 3; !b, 1, 2;;;" # returns [[here, 2, 3]]
def parseKeyframes(string: str):
    comma = pp.Suppress(",")
    pattern = pp.Word(pp.alphas + "_") + comma + pp.Word(pp.nums) + pp.Optional(comma + pp.Word(pp.nums), default=None)
    # parse pattern separated by ";"
    pattern = pattern | pp.SkipTo(pp.Literal(";"))
    parsed = pp.delimitedList(pp.Group(pattern), ";")
    parsed = parsed.parseString(string)
    return parsed
print(parseKeyframes(testcase))
print(parseKeyframes(testcase2))
print(parseKeyframes(testcase3))
print(parseKeyframes(testcase4))
print(parseKeyframes(testcase5))
print(parseKeyframes(testcase6))
Output:
[['bounce', '5', '10'], ['shutter', '12', '14']]
[['bounce', '5', '10'], ['shutter', '12', '14'], ['low_effort', '2', '23']]
[['_lolw, a, 2'], ['effect', '6', None]]
[['bounce', '1', '10'], ['effect', '5', None]]
[[''], [''], [''], ['effect', '10', None], ['bounce, a, 1'], ['bounce', '3', '10']]
[['effect, b, a'], ['9, 10, 11'], ['max9, 10, 11'], ['here', '2', '3'], ['!b, 1, 2'], [''], ['']]
You have the right idea; you just need to add a few additional terms and suppress the errors.
To handle a case like 'effect, 5, a', where the pattern would otherwise match only the leading 'effect, 5', you must add a FollowedBy term so that a match is followed by either a ';' or the end of the string.
Since ";" | end_of_string occurs twice for very similar purposes, I created a next_delim term for that. This makes the lookaheads a little clearer. I also defined word and integer terms, which made pattern easier for me to follow.
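As a small illustration of why the lookahead matters, the sketch below parses 'effect, 5, a' with and without a FollowedBy term. The names strict, prefix_match and rejected are introduced here only for the demonstration; the grammar terms mirror the ones described above.

```python
import pyparsing as pp

comma = pp.Suppress(",")
word = pp.Word(pp.alphas + "_")
integer = pp.Word(pp.nums)
pattern = word + comma + integer + pp.Optional(comma + integer, default=None)

# Without a lookahead, the pattern accepts the valid prefix "effect, 5"
# and simply stops in front of ", a".
prefix_match = pattern.parseString("effect, 5, a").asList()

# With the lookahead, a match only counts if ';' or the end of the
# string comes next, so the same input is rejected as a whole.
next_delim = pp.Literal(";") | pp.StringEnd()
strict = pattern + pp.FollowedBy(next_delim)
try:
    strict.parseString("effect, 5, a")
    rejected = False
except pp.ParseException:
    rejected = True

print(prefix_match)  # ['effect', '5', None]
print(rejected)      # True
```

FollowedBy does not consume any input, so the ';' is still available for the delimited list to match afterwards.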
I think this slightly modified version of your parser will give you your expected results:
import pyparsing as pp
comma = pp.Suppress(",")
# ';'-delimited list of word , int [, int]
word = pp.Word(pp.alphas + "_")
integer = pp.Word(pp.nums)
pattern = (word
           + comma
           + integer
           + pp.Optional(comma + integer, default=None))
# parse pattern separated by ";"
next_delim = ";" | pp.StringEnd()
# consume an invalid entry up to the next delimiter and drop it from the results
skip_invalid = pp.SkipTo(next_delim).suppress()
pattern_or_skip = (pp.Group(pattern + pp.FollowedBy(next_delim))
                   | skip_invalid)
parser = pp.delimitedList(pattern_or_skip, delim=";")
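To check the parser against the test cases from the question, it can be wrapped in a function and run end to end. This is only a sketch: the function name parse_keyframes and the use of asList() to get plain Python lists are choices made here, not part of the original answer.

```python
import pyparsing as pp


def parse_keyframes(text):
    """Return the valid 'word, int[, int]' entries of a ';'-separated string."""
    comma = pp.Suppress(",")
    word = pp.Word(pp.alphas + "_")
    integer = pp.Word(pp.nums)
    pattern = word + comma + integer + pp.Optional(comma + integer, default=None)

    # An entry must be followed by ';' or the end of the string.
    next_delim = pp.Literal(";") | pp.StringEnd()
    # Anything that is not a complete entry is skipped and suppressed.
    skip_invalid = pp.SkipTo(next_delim).suppress()
    pattern_or_skip = pp.Group(pattern + pp.FollowedBy(next_delim)) | skip_invalid

    parser = pp.delimitedList(pattern_or_skip, delim=";")
    return parser.parseString(text).asList()


testcases = [
    "bounce, 5, 10; shutter, 12, 14",
    "bounce, 5, 10; shutter, 12, 14; low_effort, 2, 23",
    "_lolw, a, 2; effect, 6;",
    "bounce, 1, 10; effect, 5, a; bounce, 2, 10",
    ";;;effect, 10; bounce, a, 1; bounce, 3, 10",
    "effect, b, a; 9, 10, 11; max9, 10, 11; here, 2, 3; !b, 1, 2;;;",
]

for case in testcases:
    print(parse_keyframes(case))
```

Note that the parsed values are strings (for example '5' rather than 5); converting them to integers would need an extra step such as a parse action on the integer term.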