
python pyparsing string: A,b,b,b,A,b,b,b* [closed]

I have a crazy problem.

I am trying to use pyparsing to parse something like this (the dots stand for unimportant text that should be suppressed):

...... A
B .......
B .......
...... A
B .......
B .......

What I need is something like this, with each A and the B elements that follow it joined in one list:

 [ [ [A],[B,B] ], [ [A],[B,B] ] , ...]

This is my code, which does not work and returns only the first [A]:

table = pyparsing.OneOrMore(pyparsing.Group(A + (pyparsing.OneOrMore(pyparsing.Group(B) | pyparsing.SkipTo(B).suppress()))) | pyparsing.SkipTo(A).suppress())

I have already managed to parse this into a flat list like this:

[ [A],[B],[B], [A],[B],[B] , ...]

But this was not acceptable, since the A and B elements were not directly connected in one list.

It is worth mentioning that

table1 = pyparsing.ZeroOrMore(pyparsing.Group(A) | pyparsing.SkipTo(A).suppress())
table2 = pyparsing.ZeroOrMore(pyparsing.Group(B) | pyparsing.SkipTo(B).suppress())

works and returns a list of all A elements and a list of all B elements, respectively.

Yes, you can have OneOrMore's embedded within other OneOrMore's; it would severely limit the parsers you could write if you couldn't.

I think you might be able to adapt your existing solution if you do better grouping. See how the Groups are defined in this toy example:

test = """
...... A 
B ....... 
B ....... 
...... A 
B ....... 
B ......."""

from pyparsing import Literal, Word, printables, Group, OneOrMore

A = Literal("A")
B = Literal("B")

notAorB = Word(printables, excludeChars="AB")

parser = OneOrMore(Group(A + Group(OneOrMore(B))))
parser.ignore(notAorB)

print(parser.parseString(test).asList())

Prints:

[['A', ['B', 'B']], ['A', ['B', 'B']]]
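The question asks for A to sit in its own sub-list, i.e. [ [ [A],[B,B] ], ... ]. A minimal sketch of one way to get that shape from the toy example is to wrap A in its own Group as well. The names `sample`, `filler`, `record` and `table` are illustrative only and are not from the original post, and the real A and B expressions are assumed to be more complex than single literals.

```python
from pyparsing import Group, Literal, OneOrMore, Word, printables

sample = """
...... A
B .......
B .......
...... A
B .......
B .......
"""

A = Literal("A")
B = Literal("B")

# Any run of printable characters that contains neither "A" nor "B"
# is treated as filler and skipped.
filler = Word(printables, excludeChars="AB")

# One record: A in its own group, followed by a group of one or more Bs.
record = Group(Group(A) + Group(OneOrMore(B)))
table = OneOrMore(record)
table.ignore(filler)

result = table.parseString(sample).asList()
print(result)
```

With this input the result should be nested as [[['A'], ['B', 'B']], [['A'], ['B', 'B']]]. The only change from the toy example is the extra Group around A.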

I think you have two options:

  • rethink the way you're parsing the text
  • stay happy with what you have and just clean things up afterwards. If you have a list like your_list=[[A],[B],[B],[A],[B],[B]], you can just do

     [[your_list[i], your_list[i+1] + your_list[i+2]] for i in range(0, len(your_list), 3)]

    The + concatenates the two [B] lists (your_list[i+1] and your_list[i+2]). This assumes every A is followed by exactly two B elements.
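If the number of B elements after each A can vary, the fixed stride of three no longer works. A rough sketch of a more general clean-up step follows. It assumes the flat list holds one-element lists and that an A entry can be recognised by comparing it with ["A"]. The helper name `regroup` is hypothetical and not part of pyparsing.

```python
def regroup(flat):
    """Turn [[A], [B], [B], [A], [B], ...] into [[[A], [B, B]], [[A], [B, ...]], ...]."""
    grouped = []
    for item in flat:
        if item == ["A"]:
            # Start a new record: the A entry plus an empty list for its Bs.
            grouped.append([item, []])
        elif grouped:
            # Attach this B to the most recent A.
            grouped[-1][1].extend(item)
        # Any B that appears before the first A is silently dropped.
    return grouped


your_list = [["A"], ["B"], ["B"], ["A"], ["B"], ["B"], ["B"]]
result = regroup(your_list)
print(result)  # [[['A'], ['B', 'B']], [['A'], ['B', 'B', 'B']]]
```

For real parse results, the comparison with ["A"] would need to be replaced by whatever test distinguishes an A entry from a B entry.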
