Python regular expressions - re.search() vs re.findall()

For school I'm supposed to write a Python regular-expression script that extracts IP addresses. The regular expression I'm using seems to work with re.search() but not with re.findall().

import re

exp = r"(\d{1,3}\.){3}\d{1,3}"
ip = "blah blah 192.168.0.185 blah blah"
match = re.search(exp, ip)
print(match.group())

The match for that is always 192.168.0.185, but it's different when I use re.findall():

exp = r"(\d{1,3}\.){3}\d{1,3}"
ip = "blah blah 192.168.0.185 blah blah"
matches = re.findall(exp, ip)
print(matches[0])

0.

I'm wondering why re.findall() yields 0. when re.search() yields 192.168.0.185, since I'm using the same expression for both functions.

And what can I do to make it so re.findall() will actually follow the expression correctly? Or am I making some kind of mistake?

findall returns a list of matches, and from the documentation:

If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.

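So your expression has one capturing group, and that group is repeated three times within a single match. A repeated group keeps only its last repetition, which here is 0., and that is what findall returns.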
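As a small sketch of the point above, the same match object can be asked for both the whole match and the group. The variable names here (text, pattern, whole_match, last_group, found) are illustrative and not from the question.

```python
import re

text = "blah blah 192.168.0.185 blah blah"
pattern = r"(\d{1,3}\.){3}\d{1,3}"  # one capturing group, repeated three times

m = re.search(pattern, text)
whole_match = m.group(0)  # the entire text the pattern matched
last_group = m.group(1)   # only the last repetition of the capturing group

# findall reports the group rather than the whole match when a group is present
found = re.findall(pattern, text)

print(whole_match)  # 192.168.0.185
print(last_group)   # 0.
print(found)        # ['0.']
```

So re.search() and re.findall() agree on where the match is; they differ only in which part of it they hand back by default.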

To fix your problem, use exp = r"(?:\d{1,3}\.){3}\d{1,3}". With the non-capturing form (?:...), the pattern contains no capturing groups, so findall returns the whole match, the same text that re.search() gives you.
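A minimal sketch of the non-capturing version, run against an input with two addresses to show that findall returns every full match. The sample string and the name addresses are made up for illustration.

```python
import re

# Hypothetical input containing two addresses
text = "hosts: 192.168.0.185 and 10.0.0.1"

# (?:...) groups the repeated part without capturing it
pattern = r"(?:\d{1,3}\.){3}\d{1,3}"

addresses = re.findall(pattern, text)
print(addresses)  # ['192.168.0.185', '10.0.0.1']
```

Note that this pattern only checks the shape of the address; it does not verify that each octet is in the range 0 to 255.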

You're only capturing the 0. in that regex, because a repeated group keeps only the last repetition it matched.

Change the expression to capture the entire IP, and make the repeated part a non-capturing group:

In [1]: import re

In [2]: ip = "blah blah 192.168.0.185 blah blah"

In [3]: exp = r"((?:\d{1,3}\.){3}\d{1,3})"

In [4]: m = re.findall(exp, ip)

In [5]: m
Out[5]: ['192.168.0.185']

And if it helps to explain the regex:

In [6]: re.compile(exp, re.DEBUG)
subpattern 1
  max_repeat 3 3
    subpattern None
      max_repeat 1 3
        in
          category category_digit
      literal 46
  max_repeat 1 3
    in
      category category_digit

This shows the subpatterns. Subpattern 1 is the outer capturing group, which is what findall returns; subpattern None is the non-capturing group that is repeated three times.
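To illustrate the documented tuple behaviour, here is a sketch with a second capturing group added around the last octet. That extra group is an assumption made purely for the example and is not part of the answer's pattern.

```python
import re

text = "blah blah 192.168.0.185 blah blah"

# Group 1: the whole address. Group 2 (added for illustration): the last octet.
pattern = r"((?:\d{1,3}\.){3}(\d{1,3}))"

results = re.findall(pattern, text)
print(results)  # [('192.168.0.185', '185')]
```

With more than one capturing group, each element of the list is a tuple with one entry per group, so keep a single capturing group (or none) if you want a flat list of strings.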
