
Python parse IP regex

I want to be able to parse something like "10.[3-25].0.X" into the actual list of IP addresses described by this rule, so for the above example rule the list would be [10.3.0.0, 10.3.0.1, ..., 10.25.0.255]. What's the best way to do it? So far the only thing I have been able to come up with is the following awful-looking function:

import re

def expand(wc):  # the post omits the def line; the name is a placeholder
    # Drop all whitespace and normalize the wildcard to upper case.
    wc = ''.join(wc.split()).upper()
    # Rewrite bare numbers as ranges ("10" -> "[10-10]"). The \d in the lookarounds
    # keeps the match from landing inside a multi-digit bound such as [100-200].
    wc = re.sub(r'(?<![\[\d-])(\d+)(?![\]\d-])', r'[\1-\1]', wc)
    wc = re.sub(r'X', r'[0-255]', wc).split('.')
    ips = []
    for i in range(int(re.findall(r'(\d+)-(\d+)', wc[0])[0][0]), int(re.findall(r'(\d+)-(\d+)', wc[0])[0][1]) + 1):
        for j in range(int(re.findall(r'(\d+)-(\d+)', wc[1])[0][0]), int(re.findall(r'(\d+)-(\d+)', wc[1])[0][1]) + 1):
            for k in range(int(re.findall(r'(\d+)-(\d+)', wc[2])[0][0]), int(re.findall(r'(\d+)-(\d+)', wc[2])[0][1]) + 1):
                for p in range(int(re.findall(r'(\d+)-(\d+)', wc[3])[0][0]), int(re.findall(r'(\d+)-(\d+)', wc[3])[0][1]) + 1):
                    ips.append(str(i) + '.' + str(j) + '.' + str(k) + '.' + str(p))
    return ips

Any improvement ideas would be greatly appreciated.

Here's a possible example using itertools.product. The idea is to first evaluate the "template" (e.g. 1.5.123.2-5, 23.10-20.X.12, ...) octet by octet, each octet yielding a list of values, and then take the cartesian product of those lists.

import itertools
import re
import sys

def octet(s):
    """
    Takes a string which represents a single octet template.
    Returns a list of values. Basic sanity checks.
    """
    if s == 'X':
        # range replaces Python 2's xrange so the script also runs on Python 3.
        return range(256)
    try:
        low, high = [int(val) for val in s.strip('[]').split('-')]
        if low > high or low < 0 or high > 255:
            raise RuntimeError('That is no valid range.')
        return range(low, high + 1)
    except ValueError:
        # Not a "low-high" pair, so treat s as a single number.
        number = int(s)
        if not 0 <= number <= 255:
            raise ValueError('Only 0-255 allowed.')
        return [number]

if __name__ == '__main__':
    try:
        template = sys.argv[1]
        octets = [octet(s) for s in template.split('.')]
        for parts in itertools.product(*octets):
            print('.'.join(map(str, parts)))
    except IndexError as err:
        print('Usage: %s IP-TEMPLATE' % (sys.argv[0]))
        sys.exit(1)

(Small) examples:

$ python ipregex.py '1.5.123.[2-5]'
1.5.123.2
1.5.123.3
1.5.123.4
1.5.123.5

$ python ipregex.py '23.[19-20].[200-240].X'
23.19.200.0
23.19.200.1
23.19.200.2
...
23.20.240.253
23.20.240.254
23.20.240.255   
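
For use from other code rather than the command line, the same idea can be wrapped in a generator. This is only a minimal sketch for Python 3, not the script above: the names `parse_octet` and `iter_ips` are hypothetical, and it assumes each octet is either `X`, a single number, or a `[low-high]` range (brackets optional).

```python
import itertools


def parse_octet(s):
    """Return the range of values described by one octet template."""
    s = s.strip().upper()
    if s == 'X':
        return range(256)
    # A plain number such as "10" has no separator and becomes the range 10-10.
    low, sep, high = s.strip('[]').partition('-')
    low = int(low)
    high = int(high) if sep else low
    if not 0 <= low <= high <= 255:
        raise ValueError('invalid octet template: %r' % s)
    return range(low, high + 1)


def iter_ips(template):
    """Lazily yield every address matched by a template like '10.[3-25].0.X'."""
    parts = template.split('.')
    if len(parts) != 4:
        raise ValueError('expected four octets, got %d' % len(parts))
    octets = [parse_octet(p) for p in parts]
    # product() walks the combinations one at a time instead of building a list.
    for combo in itertools.product(*octets):
        yield '.'.join(map(str, combo))
```

`list(iter_ips('1.5.123.[2-5]'))` gives the same four addresses as the first example. Keeping the expansion lazy matters because a template such as `X.X.X.X` describes 256**4 addresses.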

You could make this a lot simpler.

First, instead of writing the exact same thing four times, use a loop or a listcomp:

ranges = [range(int(re.findall(r'(\d+)-(\d+)', wc[i])[0][0]), 
                int(re.findall(r'(\d+)-(\d+)', wc[i])[0][1]) + 1)
          for i in range(4)]

You can also turn the nested loop into a flat loop over the cartesian product:

for i, j, k, p in itertools.product(*ranges):

And you can turn that long string-concatenation mess into a simple format or join call:

ips.append('{}.{}.{}.{}'.format(i, j, k, p)) # OR
ips.append('.'.join(map(str, (i, j, k, p))))

And that means you don't need to split out the 4 components in the first place:

for components in itertools.product(*ranges):
    ips.append('{}.{}.{}.{}'.format(*components)) # OR
    ips.append('.'.join(map(str, components)))

And now that the loop is so trivial, you can turn it into a listcomp:

ips = ['{}.{}.{}.{}'.format(*components)
       for components in itertools.product(*ranges)]
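
Putting those steps together, the whole function might look like the sketch below. The name `expand` is hypothetical, and the normalization lines are taken from the question with one assumed tweak: `\d` is added to the lookarounds so that the bounds of a multi-digit range such as `[100-200]` are not rewritten.

```python
import itertools
import re


def expand(wc):
    """Expand a template such as '10.[3-25].0.X' into a list of IP strings."""
    wc = ''.join(wc.split()).upper()
    # Rewrite bare numbers as one-element ranges: "10" -> "[10-10]".
    wc = re.sub(r'(?<![\[\d-])(\d+)(?![\]\d-])', r'[\1-\1]', wc)
    # X stands for any octet value.
    wc = wc.replace('X', '[0-255]').split('.')
    ranges = []
    for part in wc:
        # Run the regex once per octet and unpack both bounds.
        low, high = re.findall(r'(\d+)-(\d+)', part)[0]
        ranges.append(range(int(low), int(high) + 1))
    return ['{}.{}.{}.{}'.format(*components)
            for components in itertools.product(*ranges)]
```

Unpacking `low, high` from a single `findall` call also removes the duplicated regex evaluation per octet.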

Case 1:

ip = re.search(r'(\d{1,3}.){3}\d{1,3}', '192.168.1.100')
print(ip.group())

Output: 192.168.1.100

Case 2:

ips = re.findall(r'(\d{1,3}.){3}\d{1,3}', '192.168.1.100')
print(ips)

Output: ['1.']

Case 3:

ips = re.findall(r'(?:\d{1,3}.){3}\d{1,3}', '192.168.1.100')
print(ips)

Output: ['192.168.1.100']

Why does the regex that works in case 1 (search) not work in case 2 (findall)?
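
The difference comes from how `re.findall` reports matches, as described in the `re` module documentation. With no capturing group it returns the whole match, but with exactly one capturing group it returns only that group's text, and a repeated group keeps just its last repetition (here `1.`). `re.search(...).group()` always returns the whole match. A small sketch comparing the three cases, using the patterns from the post unchanged (variable names are illustrative):

```python
import re

text = '192.168.1.100'
capturing = r'(\d{1,3}.){3}\d{1,3}'
non_capturing = r'(?:\d{1,3}.){3}\d{1,3}'

# search().group() is group 0, i.e. the entire match.
whole = re.search(capturing, text).group()

# findall() with one capturing group returns that group's last repetition.
last_group = re.findall(capturing, text)

# With (?:...) there is no capturing group, so findall() returns whole matches.
whole_matches = re.findall(non_capturing, text)

print(whole)          # 192.168.1.100
print(last_group)     # ['1.']
print(whole_matches)  # ['192.168.1.100']
```

Note that the unescaped `.` in these patterns matches any character; writing `\.` would restrict it to a literal dot.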
