
Regex: how to extract only first IP address from string (in Python)

Given the following string (or similar strings, some of which may contain more than one IP address):

from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03

I want to extract the first, and only the first, IP address in Python. A first attempt with something like ([0-9]{2,}\.){3}([0-9]{2,}){1}, tried out on nregex.com, looks almost OK: it matches the IP address, but it also matches the other substring that roughly resembles an IP address (170.2015.06.03.14.12.03). When the same pattern is passed to re.compile/re.findall, though, the result is:

[(u'243.', u'70'), (u'06.', u'03')]

So clearly the regex is no good. How can I improve it so that it's neater and catches all IPv4 addresses, and how can I make it match only the first one?

Many thanks.

Use re.search with the following pattern:

>>> s = 'from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03'
>>> import re
>>> re.search(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', s).group()
'208.83.243.70'
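As for the tuples in the question's output: when a pattern contains capturing groups, re.findall returns the groups instead of the whole match, and a repeated group keeps only its last repetition. The sketch below reproduces that with the question's pattern and then with non-capturing groups; the variable names are only illustrative, and the output shown is for Python 3 (no u'' prefixes). Note that {2,} also skips single-digit octets and accepts runs longer than three digits, which is why \d{1,3} is used above.

```python
import re

s = ('from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) '
     'by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03')

# The question's pattern has two capturing groups, so findall returns
# one tuple per match holding the last text captured by each group.
grouped = re.findall(r'([0-9]{2,}\.){3}([0-9]{2,}){1}', s)
print(grouped)    # [('243.', '70'), ('06.', '03')]

# With (?:...) non-capturing groups, findall returns the full matches.
ungrouped = re.findall(r'(?:[0-9]{2,}\.){3}[0-9]{2,}', s)
print(ungrouped)  # ['208.83.243.70', '170.2015.06.03']
```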

The regex you want is r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'. This matches four 1- to 3-digit numbers separated by dots.

If the IP address always comes before any other dotted numbers in the string, you can avoid picking those up by using a function that returns only the first match, such as re.search. In contrast, re.findall will catch both 208.83.243.70 and 015.06.03.14.
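A minimal sketch of that difference, using the string from the question (the variable names are illustrative):

```python
import re

s = ('from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) '
     'by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03')

pattern = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')

# search() stops at the first match, scanning left to right.
first = pattern.search(s)
print(first.group())       # 208.83.243.70

# findall() keeps scanning and also picks up the date-like tail.
every = pattern.findall(s)
print(every)               # ['208.83.243.70', '015.06.03.14']
```

Here search works only because the real address happens to appear before the lookalike; if the order were reversed, it would return the lookalike instead.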

Are you OK with using the brackets to single out the IP address? If so, you can change the regex to r'\[(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\]'. It would be safer that way.
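The bracketed variant could be wrapped up roughly like this. It assumes the address of interest is always enclosed in square brackets, as in the sample string; first_bracketed_ip and BRACKETED_IP are hypothetical names chosen for this sketch.

```python
import re

# Assumption: the wanted address is always wrapped in [ ], as in the sample.
BRACKETED_IP = re.compile(r'\[(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\]')

def first_bracketed_ip(text):
    """Return the first bracketed dotted quad in text, or None if absent."""
    match = BRACKETED_IP.search(text)
    # search() returns None when nothing matches, so guard before .group().
    return match.group(1) if match else None

s = ('from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) '
     'by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03')

print(first_bracketed_ip(s))             # 208.83.243.70
print(first_bracketed_ip('no address'))  # None
```

group(1) returns only the text inside the brackets. The pattern still accepts out-of-range octets such as 999, so a range check would be needed if strict IPv4 validity matters.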
