
Match IP address using regex

I'm trying to match a simplified version of IP addresses (I believe this pattern should match all IP addresses plus some things that aren't IP addresses, but that's not really important). I'm using this syntax in Python:

'([0-9]{1,3}\.){3}[0-9]{1,3}'

This, however, matches "127.", for example. As far as I can tell, it's interpreting what I've provided as a list of patterns rather than a single one. What am I missing?

UPDATE: Yes, sorry everyone, I had a typo. I fixed it.

Everyone is saying the pattern as-is works perfectly, but I'm not getting that. Maybe my issue lies elsewhere:

        matches = regex.findall(line)
        for match in matches:
            matchList.add(label + match)

If I use the pattern '\\d{1,3}.\\d{1,3}.\\d{1,3}.\\d{1,3}' instead (same thing, I just repeated it), this works perfectly and gives a full IP address. However, if I use the pattern above, it instead gives '195.'

If I put parens around this expression to get '((\\d{1,3}.){3}\\d{1,3})', label + match gives me the error 'cannot concatenate string and tuple objects'.

Quick answer, use this instead:

(?:[0-9]{1,3}\.){3}[0-9]{1,3}

Long answer:

Using 127.0.0.1 as an example, findall with the regex you posted returns only "0." rather than the full address. The parentheses you're using create a capturing group. The entire pattern still has to be found, but when a pattern contains a capturing group, findall returns what's in the () group instead of the whole match. Because the group is repeated by the {3}, it matches "127.", then "0.", then "0." again, and a repeated group only keeps its last repetition. So you end up with the third repetition and therefore "0."

A set of parentheses by themselves creates a capturing group, but what you want instead is a non-capturing group. Add a ?: just after the first parenthesis, like I showed above, to signify this. That way findall will return the entire match. This should give you the "simplified" regex you're looking for.
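To make the difference concrete, here is a minimal sketch comparing the two patterns with findall. The sample string and the variable names are made up for illustration.

```python
import re

# Hypothetical sample text containing two addresses.
text = "ping 127.0.0.1 and 10.20.195.7"

capturing = re.compile(r"([0-9]{1,3}\.){3}[0-9]{1,3}")
non_capturing = re.compile(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}")

# One capturing group: findall returns the group's last repetition.
print(capturing.findall(text))      # ['0.', '195.']

# No capturing groups: findall returns each full match.
print(non_capturing.findall(text))  # ['127.0.0.1', '10.20.195.7']
```

The only change between the two patterns is the ?: after the opening parenthesis, so what gets matched is identical and only what findall reports changes.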

Maybe you mistyped something when you posted, but when I used your regex as posted, it didn't match "127." or "127.0.0.1". When I removed the extraneous backslash, it seemed to work fine for me:

In [22]: re.match(r'([0-9]{1,3}\.){3}[0-9]{1,3}', '127.0.0.1')
Out[22]: <_sre.SRE_Match object at 0x1013de5d0>

In [23]: re.match(r'([0-9]{1,3}\.){3}[0-9]{1,3}', '127.')
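This may also explain why the pattern looks fine with re.match but not with findall. A small sketch, assuming the same pattern as in the session above: the match object covers the whole address, while group 1 (which is what findall reports) holds only the last repetition.

```python
import re

pattern = re.compile(r"([0-9]{1,3}\.){3}[0-9]{1,3}")

m = pattern.match("127.0.0.1")
print(m.group(0))  # whole match: 127.0.0.1
print(m.group(1))  # last repetition of the group: 0.

# "127." alone is not enough for three repetitions plus a final number.
print(pattern.match("127."))  # None

# findall reports group 1 rather than the whole match.
print(pattern.findall("127.0.0.1"))  # ['0.']
```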

Try this,

quoted from this:

import re

def is_valid_ipv4(ip):
    """Validates IPv4 addresses.
    """
    pattern = re.compile(r"""
        ^
        (?:
          # Dotted variants:
          (?:
            # Decimal 1-255 (no leading 0's)
            [3-9]\d?|2(?:5[0-5]|[0-4]?\d)?|1\d{0,2}
          |
            0x0*[0-9a-f]{1,2}  # Hexadecimal 0x0 - 0xFF (possible leading 0's)
          |
            0+[1-3]?[0-7]{0,2} # Octal 0 - 0377 (possible leading 0's)
          )
          (?:                  # Repeat 0-3 times, separated by a dot
            \.
            (?:
              [3-9]\d?|2(?:5[0-5]|[0-4]?\d)?|1\d{0,2}
            |
              0x0*[0-9a-f]{1,2}
            |
              0+[1-3]?[0-7]{0,2}
            )
          ){0,3}
        |
          0x0*[0-9a-f]{1,8}    # Hexadecimal notation, 0x0 - 0xffffffff
        |
          0+[0-3]?[0-7]{0,10}  # Octal notation, 0 - 037777777777
        |
          # Decimal notation, 1-4294967295:
          429496729[0-5]|42949672[0-8]\d|4294967[01]\d\d|429496[0-6]\d{3}|
          42949[0-5]\d{4}|4294[0-8]\d{5}|429[0-3]\d{6}|42[0-8]\d{7}|
          4[01]\d{8}|[1-3]\d{0,9}|[4-9]\d{0,8}
        )
        $
    """, re.VERBOSE | re.IGNORECASE)
    return pattern.match(ip) is not None

Is that backslash before [0-9] a typo?

If so, if you add parentheses around the whole expression, '(([0-9]{1,3}\\.){3}[0-9]{1,3})', you'll create a capture group that captures the entire match. Otherwise you're just capturing a part of your string.
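One caveat, sketched here under the assumption that the loop from the question is kept: with two capture groups, findall returns a tuple per match, which is where the "cannot concatenate" error in the question comes from. Iterating with finditer and taking group(0) avoids that. The names label and match_list mirror the question, and the sample line is invented.

```python
import re

pattern = re.compile(r"(([0-9]{1,3}\.){3}[0-9]{1,3})")
line = "src=192.168.195.4 dst=10.0.0.1"  # hypothetical input line

# Two groups: each findall result is a (outer group, inner group) tuple.
print(pattern.findall(line))
# [('192.168.195.4', '195.'), ('10.0.0.1', '0.')]

label = "ip:"
match_list = set()
for m in pattern.finditer(line):
    # group(0) is always the whole match, regardless of how many groups exist.
    match_list.add(label + m.group(0))

print(sorted(match_list))  # ['ip:10.0.0.1', 'ip:192.168.195.4']
```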


 