
Matching an object and a specific regex with Python

Given a text, I need to check each character to see whether it has 3 capital letters on both sides, and if it does, add it to a string of such characters that is returned.

I wrote the following: m = re.match("[A-Z]{3}.[A-Z]{3}", text) (let's say text = "AAAbAAAcAAA").

I expected to get two groups in the match object: "AAAbAAA" and "AAAcAAA"

Now, when I invoke m.group(0), I get "AAAbAAA", which is right. Yet when I invoke m.group(1), I find that there is no such group, meaning "AAAcAAA" wasn't a match. Why?

Also, when I invoke m.groups(), I get an empty tuple, although I expected a tuple of the matches, which in my case should at least contain "AAAbAAA". Why doesn't that work?

You don't have any groups in your pattern. To capture something in a group, you have to surround it with parentheses, for example [A-Z]{3}(.)[A-Z]{3} to capture the middle character.


The exception is m.group(0), which always contains the entire match.
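As a minimal sketch of the difference, the snippet below runs the question's pattern with and without parentheses. The variable names m_plain and m_grouped are only illustrative, and the grouped pattern is one possible way to add a capture group, not necessarily the one you need.

```python
import re

text = "AAAbAAAcAAA"

# No parentheses: the pattern matches, but it defines no capture groups.
m_plain = re.match(r"[A-Z]{3}.[A-Z]{3}", text)
print(m_plain.group(0))   # the entire match: AAAbAAA
print(m_plain.groups())   # ()

# Parentheses around the middle character create group 1.
m_grouped = re.match(r"[A-Z]{3}(.)[A-Z]{3}", text)
print(m_grouped.group(0))  # still the entire match: AAAbAAA
print(m_grouped.group(1))  # b
print(m_grouped.groups())  # ('b',)
```

Note that re.match only tries to match at the start of the string and returns a single match object, so "AAAcAAA" never shows up here regardless of grouping.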

Looking over your question, it sounds like you aren't actually looking for capture groups, but rather overlapping matches. In regex, a group means a smaller part of the match that is set aside for later use. For example, if you're trying to match phone numbers with a pattern such as (\d{3})-(\d{3}-\d{4}), then the area code would be in group(1), the local part in group(2), and the entire thing would be in group(0).
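To make the phone number illustration concrete, here is a small sketch. The pattern and the sample number are assumptions chosen for the example (a simple NNN-NNN-NNNN format), not a general-purpose phone number regex.

```python
import re

# Hypothetical pattern: group 1 is the area code, group 2 is the local part.
phone_pattern = re.compile(r"(\d{3})-(\d{3}-\d{4})")

m = phone_pattern.match("555-867-5309")  # made-up sample number

print(m.group(0))  # entire match: 555-867-5309
print(m.group(1))  # area code:    555
print(m.group(2))  # local part:   867-5309
print(m.groups())  # ('555', '867-5309')
```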

What you want is to find overlapping matches. Here's a Stack Overflow answer that explains how to do overlapping matches in Python regex, and here's my favorite reference for capture groups and regex in general.
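One common way to get overlapping matches in Python is to wrap the whole pattern in a lookahead with a capture group, so that each match consumes no characters. The sketch below applies that idea to the string from the question; it is one possible approach, not necessarily the one described in the linked answer.

```python
import re

text = "AAAbAAAcAAA"

# A plain findall consumes "AAAbAAA", leaving only "cAAA",
# so the second, overlapping candidate is never found.
non_overlapping = re.findall(r"[A-Z]{3}.[A-Z]{3}", text)
print(non_overlapping)  # ['AAAbAAA']

# The lookahead (?=...) matches at a position without consuming text,
# and the inner group captures what was seen from that position.
overlapping = re.findall(r"(?=([A-Z]{3}.[A-Z]{3}))", text)
print(overlapping)      # ['AAAbAAA', 'AAAcAAA']
```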

For one, you are using match when it looks like you want findall. It won't grab the enclosing capital triplets, but re.findall('[A-Z]{3}([a-z])(?=[A-Z]{3})', search_string) will get you all single lowercase characters surrounded on both sides by 3 caps.
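Putting the findall suggestion together with the original goal of returning a string, a minimal sketch might look like this. The function name surrounded_chars is hypothetical, and the pattern assumes the character of interest is a lowercase ASCII letter.

```python
import re

def surrounded_chars(search_string):
    """Return the lowercase letters that have 3 capitals on both sides."""
    # The right-hand triplet is checked with a lookahead, so it is not
    # consumed and can serve as the left-hand triplet of the next match.
    found = re.findall(r"[A-Z]{3}([a-z])(?=[A-Z]{3})", search_string)
    return "".join(found)

result = surrounded_chars("AAAbAAAcAAA")
print(result)  # bc
```

Because the pattern has exactly one capture group, findall returns a list of that group's contents rather than the full matches.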

