IP address/network parsing from text file using Python
I have the text file below and need some help parsing the IP addresses out of it.
The text file is of the form:
abc 10.1.1.1/32 aabbcc
def 11.2.0.0/16 eeffgg
efg 0.0.0.0/0 ddeeff
In other words, a bunch of IP networks appear as part of a log file. The output should look like this:
10.1.1.1/32
11.2.0.0/16
0.0.0.0/0
I have the code below, but it does not output the required information:
import re

file = open(filename, 'r')
for eachline in file.readlines():
    ip_regex = re.findall(r'(?:\d{1,3}\.){3}\d{1,3}', eachline)
    print(ip_regex)
First, your regex doesn't even attempt to capture anything but four dotted numbers, so of course it's not going to match anything else, like a /32 on the end. If you just add, e.g., /\d{1,2} to the end, it'll fix that:
(?:\d{1,3}\.){3}\d{1,3}/\d{1,2}
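As a quick check, here is a minimal sketch that applies the extended pattern to the three sample lines from the question. The names `sample` and `PATTERN` are only illustrative. Note that the pattern checks shape, not value, so something like 999.999.999.999/99 would also match.

```python
import re

# Sample log lines copied from the question (illustrative data).
sample = """abc 10.1.1.1/32 aabbcc
def 11.2.0.0/16 eeffgg
efg 0.0.0.0/0 ddeeff"""

# Four dotted groups of 1-3 digits, then a slash and a 1-2 digit prefix length.
PATTERN = re.compile(r'(?:\d{1,3}\.){3}\d{1,3}/\d{1,2}')

networks = PATTERN.findall(sample)
for network in networks:
    print(network)
```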
However, if you don't understand regular expressions well enough to understand that, you probably shouldn't be using a regex as a piece of "magic" that you'll never be able to debug or extend. It's a bit more verbose with str methods like split or find, but maybe easier to understand for a novice:
for line in file:
    for part in line.split():
        try:
            address, network = part.split('/')
            a, b, c, d = address.split('.')
        except ValueError:
            pass # not in the right format
        else:
            # do something with part, or address and network, or whatever
            print(part)
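The split-based approach can also be wrapped in a small function. The sketch below is one possible way to do it: `extract_networks` is a hypothetical name, and the range checks (octets 0-255, prefix 0-32) are an addition that the loop above does not perform.

```python
def extract_networks(lines):
    """Return the whitespace-separated tokens that look like IPv4 'a.b.c.d/n'."""
    found = []
    for line in lines:
        for part in line.split():
            try:
                address, prefix = part.split('/')
                octets = [int(x) for x in address.split('.')]
                prefix = int(prefix)
            except ValueError:
                continue  # no single '/', or a non-numeric piece
            # Assumed extra validation: exactly four octets, all in range.
            if (len(octets) == 4
                    and all(0 <= o <= 255 for o in octets)
                    and 0 <= prefix <= 32):
                found.append(part)
    return found


sample_lines = [
    'abc 10.1.1.1/32 aabbcc',
    'def 11.2.0.0/16 eeffgg',
    'efg 0.0.0.0/0 ddeeff',
]
for network in extract_networks(sample_lines):
    print(network)
```

Because it accepts any iterable of strings, the same function works on an open file object as well as on a list.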
As a side note, depending on what you're actually doing with these things, you might want to use the ipaddress module (or the backport on PyPI for 2.6-3.2) rather than string parsing:
>>> import ipaddress
>>> s = '10.1.1.1/32'
>>> a = ipaddress.ip_network(s)
You can combine that with either of the above:
for line in file:
    for part in line.split():
        try:
            a = ipaddress.ip_network(part)
        except ValueError:
            pass # not the right format
        else:
            # do something with a and its nifty methods
            print(a)
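One detail worth knowing: by default ipaddress.ip_network is strict and raises ValueError when host bits are set (for example 10.1.1.1/24), while strict=False masks them off instead. The sketch below assumes Python 3 and uses a hypothetical helper name, `parse_networks`. The '/' check is an added assumption so that bare addresses without a prefix are skipped.

```python
import ipaddress


def parse_networks(lines, strict=True):
    """Collect ip_network objects for every token that contains a '/'."""
    networks = []
    for line in lines:
        for part in line.split():
            if '/' not in part:
                continue  # skip ordinary words and bare addresses
            try:
                networks.append(ipaddress.ip_network(part, strict=strict))
            except ValueError:
                pass  # not a valid network under the chosen strictness
    return networks


sample_lines = [
    'abc 10.1.1.1/32 aabbcc',
    'def 11.2.0.0/16 eeffgg',
    'efg 0.0.0.0/0 ddeeff',
]
nets = parse_networks(sample_lines)
for net in nets:
    print(net, net.num_addresses)
```

Whether strict=False is appropriate depends on whether the log is expected to contain host addresses written with a prefix length.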
In this particular case, a regex might be overkill; you could use split:
with open(filename) as f:
    ipList = [line.split()[1] for line in f]
This should produce a list of strings, which are the IP networks from the second column.
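The one-liner assumes every line has at least two whitespace-separated fields; a blank or one-word line would raise IndexError. A minimal sketch that skips such lines, using the hypothetical name `second_column`:

```python
def second_column(lines):
    """Return the second whitespace-separated field of each line that has one."""
    result = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:  # ignore blank or single-word lines
            result.append(fields[1])
    return result


sample_lines = ['abc 10.1.1.1/32 aabbcc', '', 'def 11.2.0.0/16 eeffgg', 'oneword']
print(second_column(sample_lines))
```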