Given, for example, a string like this:
random word, random characters##?, some dots. username bob.1234 other stuff
I'm currently using this regex to capture the username (bob.1234):
\busername (.+?)(,| |$)
But my code needs a regex with only one capture group, because Python's re.findall returns a list of tuples instead of a list of strings when the pattern has more than one capture group. Something like this would almost work, except that it captures the username "bob" instead of "bob.1234":
\busername (.+?)\b
Does anybody know if there is a way to use the word boundary while ignoring the dot, without using more than one capture group?
NOTES:
The \busername (.+?)(,| |$) pattern contains 2 capturing groups, and re.findall will return a list of tuples once a match is found. See the findall reference:
If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
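To make the quoted behaviour concrete, here is a minimal sketch that runs both patterns from the question against the sample string. The variable names (s, two_groups, one_group) are only illustrative and do not come from the original post.

```python
import re

# Sample string from the question.
s = "random word, random characters##?, some dots. username bob.1234 other stuff"

# Two capturing groups: findall returns one tuple per match,
# holding the username and the delimiter that followed it.
two_groups = re.findall(r"\busername (.+?)(,| |$)", s)
print(two_groups)  # [('bob.1234', ' ')]

# One capturing group: findall returns plain strings, but the lazy
# (.+?) stops at the first word boundary, which sits before the dot.
one_group = re.findall(r"\busername (.+?)\b", s)
print(one_group)  # ['bob']
```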
So, there are three approaches here:
1. Use a (?:...) non-capturing group rather than the capturing one: re.findall(r'\busername (.+?)(?:,| |$)', s). It will consume a , or space, but since only the captured part is returned and no overlapping matches are expected, that is OK.
2. Use a positive lookahead instead of the second group: re.findall(r'\busername (.+?)(?=,| |$)', s). The space and comma will not be consumed; that is the only difference from the first approach.
3. Turn (.+?)(,| |$) into a single group around a simple negated character class, ([^ ,]+), that matches one or more chars other than a space or comma. It will match till the end of the string if there is no , or space after the username.
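The three approaches can be compared side by side with a short sketch like the one below. The sample string is the one from the question; the variable names, and the exact pattern used for the negated character class variant, are assumptions made for illustration.

```python
import re

# Sample string from the question.
s = "random word, random characters##?, some dots. username bob.1234 other stuff"

# 1. Non-capturing group: the delimiter is matched and consumed but not returned.
non_capturing = re.findall(r"\busername (.+?)(?:,| |$)", s)

# 2. Positive lookahead: the delimiter is only checked, not consumed.
lookahead = re.findall(r"\busername (.+?)(?=,| |$)", s)

# 3. Negated character class: take everything up to the next space or comma
#    (or up to the end of the string if neither follows).
negated_class = re.findall(r"\busername ([^ ,]+)", s)

print(non_capturing, lookahead, negated_class)
# ['bob.1234'] ['bob.1234'] ['bob.1234']
```

All three return a flat list of strings because each pattern has exactly one capturing group. The negated character class avoids lazy matching and alternation altogether, so it is probably the simplest of the three to read.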