
Python regex char pattern group

I'm trying to write a regex pattern (in Python) that matches every group in a string that starts at an A and runs until the next A.

For example, DFDAXDJSDSJDAFGCJASDJASAGXCJAD should be split into:

'AXDJSDSJD'
'AFGCJ'
'ASDJ'
'AS'
'AGXCJ'
'AD'

The closest thing I came up with was:

string="DFDAXDJSDSJDAFGCJASDJASAGXCJAD"
r=re.compile('(A.[!=A]*)+')
matchObj = r.findall(string, re.M|re.I)

which returns AF, AS, ASA, AD.

Why does it skip the first one? Why doesn't it return all chars until the next A?

You could just split the string on A:

>>> s = "DFDAXDJSDSJDAFGCJASDJASAGXCJAD"
>>> s.split('A')
['DFD', 'XDJSDSJD', 'FGCJ', 'SDJ', 'S', 'GXCJ', 'D']

# add a leading `A` to each piece 'on the fly'; skip the first piece,
# which is the text before the first `A` and not a match
>>> ['A' + part for part in s.split('A')[1:]]
['AXDJSDSJD', 'AFGCJ', 'ASDJ', 'AS', 'AGXCJ', 'AD']

Or use an optional positive lookahead:

>>> import re
>>> re.findall('(A[^A]+(?=A)?)', s, re.IGNORECASE | re.MULTILINE)
['AXDJSDSJD', 'AFGCJ', 'ASDJ', 'AS', 'AGXCJ', 'AD']

Or simply drop the lookahead. Because it is optional, it never constrains the match, so the shorter pattern gives the same result:

>>> re.findall('(A[^A]+)', s, re.IGNORECASE | re.MULTILINE)
['AXDJSDSJD', 'AFGCJ', 'ASDJ', 'AS', 'AGXCJ', 'AD']
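As for why the attempt in the question misbehaves, two separate things seem to be going on, and the sketch below tries to show both. It reuses the pattern and string from the question; the variable names (`attempt`, `skipped`, `fixed`, and so on) are made up for illustration.

```python
import re

s = "DFDAXDJSDSJDAFGCJASDJASAGXCJAD"

# The pattern from the question. [!=A] is a character class that matches
# '!', '=' or 'A'; it is not a negation. The negated class is [^A].
attempt = re.compile('(A.[!=A]*)+')

# findall() on a compiled pattern takes (string, pos, endpos), not flags.
# re.M | re.I has the integer value 10, so the search starts at index 10
# and the first group (which begins at index 3) is never seen.
start = int(re.M | re.I)
skipped = attempt.findall(s, re.M | re.I)
same_as_pos = attempt.findall(s, start)

# Flags belong in re.compile(); the negated class yields the wanted groups.
fixed = re.compile('A[^A]*')
groups = fixed.findall(s)

print(start)        # 10
print(skipped)      # ['AF', 'AS', 'ASA', 'AD']
print(groups)       # ['AXDJSDSJD', 'AFGCJ', 'ASDJ', 'AS', 'AGXCJ', 'AD']
```

Each match is only `A` plus one character because `[!=A]*` almost never finds anything to consume; `ASA` appears because the character after `AS` happens to be an `A`, which that class does accept.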

I can propose the following method:

import re

string = "DFDAXDJSDSJDAddaFGCJASDJASAGXCJAD"
# flags are passed to re.compile(), not to findall()
r = re.compile('A[^A]*', re.I | re.M)
matchObj = r.findall(string)
print(matchObj)
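Note that the test string in this answer contains lowercase letters, so `re.I` changes the result: a lowercase `a` then both starts a group and ends the previous one. A small comparison, using hypothetical variable names, might look like this:

```python
import re

text = "DFDAXDJSDSJDAddaFGCJASDJASAGXCJAD"

# With re.I, 'a' counts as 'A' both for the leading literal
# and for the negated class [^A].
insensitive = re.findall('A[^A]*', text, re.I)

# Without re.I, only an uppercase 'A' delimits the groups.
sensitive = re.findall('A[^A]*', text)

print(insensitive)  # ['AXDJSDSJD', 'Add', 'aFGCJ', 'ASDJ', 'AS', 'AGXCJ', 'AD']
print(sensitive)    # ['AXDJSDSJD', 'AddaFGCJ', 'ASDJ', 'AS', 'AGXCJ', 'AD']
```

Which of the two is wanted depends on whether lowercase `a` should count as a delimiter; the question only shows uppercase input, so either reading is an assumption.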

