Regex: Matching individual characters without matching characters in between

I have a simple regex query.

Here is the input:

DLWLALDYVASQASV

The desired output is the positions of the bolded characters, shown here set apart by spaces: DLWLAL DY VA S QASV

So it would be D:6, Y:7, S:10.

I am using Python, so I know I can use span() or start() to obtain the start positions of a match. But if I try something like DY.{2}S, it matches the characters in between as well and only gives me the position of the first character of the match (and of the last, in the case of span()).

Is there a function or a way to retrieve the position of each specified character, not including the characters in-between?

import re

# Wrap each character of interest in its own capturing group.
match = re.search(r'(D)(Y)..(S)', 'DLWLALDYVASQASV')
# Group 0 is the whole match; groups 1 to 3 are D, Y and S.
print([match.group(i) for i in range(4)])
# ['DYVAS', 'D', 'Y', 'S']
print([match.span(i) for i in range(4)])
# [(6, 11), (6, 7), (7, 8), (10, 11)]
print([match.start(i) for i in range(4)])
# [6, 6, 7, 10]

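You can wrap the subexpressions of the regular expression in parentheses, making them capturing groups, and then access the corresponding substrings and their positions via the match object. The characters matched by .. are still part of the overall match, but they belong to no group, so they have no positions of their own. See the Python documentation for Match objects for more details.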
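To get the output in the D:6, Y:7, S:10 shape the question asks for, the group lookups can be wrapped in a small helper. This is only a sketch: the name group_positions is hypothetical and not part of the re module, and it assumes each character of interest sits in its own capturing group.

```python
import re

def group_positions(pattern, text):
    """Return (matched text, start index) for each capturing group of the first match."""
    match = re.search(pattern, text)
    if match is None:
        return []
    # Groups are numbered from 1; group 0 is the entire match and is skipped.
    return [(match.group(i), match.start(i))
            for i in range(1, match.re.groups + 1)]

positions = group_positions(r'(D)(Y)..(S)', 'DLWLALDYVASQASV')
print(positions)
# [('D', 6), ('Y', 7), ('S', 10)]
```

The helper only looks at the first match; to report every occurrence, the same list comprehension could be applied to each match yielded by re.finditer().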
