How do I write only the matching regex to a new file in Python?
My goal is to extract only the IP addresses and append them to a new file. The file I have is called error_log.txt and has lines such as:
[Sun Jun 7 16:45:56 2020] [info] [client 64.242.88.10] (104)Connection reset by peer: client stopped connection before send body completed
[Sun Jun 7 17:13:50 2020] [info] [client 64.242.88.10] (104)Connection reset by peer: client stopped connection before send body completed
The goal is to write "64.242.88.10" and the rest of the IPs to a new file.
I can get the print function to show only the IPs, but when the script writes to the file 'ip_only.txt', it writes the complete line from the error log.
How can I get only the IPs into the new file (in a column)?
As a bonus: when I print while testing, I also get blank lines. How can I omit those?
import re
with open('error_log.txt', 'r') as file:
    fi = file.readlines()
ip_only = open('ip_only.txt', 'w+')
re_ip = re.compile("\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")
for line in fi:
    ip = re.findall(re_ip, line)
    ip_only.write(str(line))
    # print(ip)
You need to write the ip variable to the file instead of line, which contains the original line:
for line in fi:
    ip = re.findall(re_ip, line)
    ip_only.write(str(ip))
# ip_only.txt:
# ['64.242.88.10']['64.242.88.10']
Additionally, to remove the brackets and quotes from your output (note that re.findall() returns a list of strings) and write each IP address on its own line:
for line in fi:
    ips = re.findall(re_ip, line)
    for ip in ips:
        ip_only.write(ip + '\n')
# ip_only.txt:
# 64.242.88.10
# 64.242.88.10
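The loop above can be folded into a small end-to-end sketch that opens both files with a with statement, so the output file is closed and flushed automatically. The helper name copy_ips and the temporary directory used for the demo are assumptions made for illustration; only the file names and the pattern come from the question.

```python
import re
import tempfile
from pathlib import Path

# Same pattern as in the question, written as a raw string.
RE_IP = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def copy_ips(src_path, dst_path):
    """Write every IP-like match in src_path to dst_path, one per line.

    Returns the number of addresses written. (Hypothetical helper.)
    """
    count = 0
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            # findall returns [] for lines without a match, so the inner
            # loop writes nothing for them.
            for ip in RE_IP.findall(line):
                dst.write(ip + "\n")
                count += 1
    return count


# Demo on a throwaway copy of the two sample lines, with a blank line between.
sample = (
    "[Sun Jun 7 16:45:56 2020] [info] [client 64.242.88.10] (104)Connection "
    "reset by peer: client stopped connection before send body completed\n"
    "\n"
    "[Sun Jun 7 17:13:50 2020] [info] [client 64.242.88.10] (104)Connection "
    "reset by peer: client stopped connection before send body completed\n"
)
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "error_log.txt"
    out_path = Path(tmp) / "ip_only.txt"
    log_path.write_text(sample)
    written = copy_ips(log_path, out_path)
    result = out_path.read_text()

print(written)          # 2
print(result, end="")   # 64.242.88.10 on two separate lines
```

Because the inner loop iterates over the list returned by findall, lines without an address (including the blank one) produce no output at all.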
While writing to the file, you are writing the whole line. Instead, write only the IPs, as below:
ip_only.write(str(ip))
To avoid blank lines, you can use an if condition to check whether an IP was found in the given line.
for line in fi:
    ip = re.findall(re_ip, line)
    if ip:
        ip_only.write(str(ip))
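The check works because re.findall() returns an empty list when nothing matches, and an empty list is falsy. A minimal in-memory sketch of that behaviour follows; the shortened sample lines and the kept list are made up for illustration.

```python
import re

RE_IP = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")

# Hypothetical input: one log-style line, one blank line, one line with no IP.
lines = [
    "[Sun Jun 7 16:45:56 2020] [info] [client 64.242.88.10] (104)Connection reset by peer\n",
    "\n",
    "no address on this line\n",
]

kept = []
for line in lines:
    ip = RE_IP.findall(line)
    if ip:  # an empty list is falsy, so lines without a match are skipped
        kept.append(ip)

print(kept)  # [['64.242.88.10']]
```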
If print(ip) gives you the expected result, then you should use write(ip) instead of write(line).
The regex gives a list, so you may need to write only ip[0]. You also need to add \n to move to the next line.
ip_only.write(ip[0] + "\n")
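Note that ip[0] raises an IndexError when a line contains no address. If at most one IP per line is expected, one possible alternative is re.search(), which returns None instead of an empty list. The helper name first_ip below is an assumption made for this sketch.

```python
import re

RE_IP = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def first_ip(line):
    """Return the first IP-like substring in line, or None if there is none."""
    match = RE_IP.search(line)
    return match.group(0) if match else None


print(first_ip("[info] [client 64.242.88.10] (104)Connection reset by peer"))
print(first_ip("a line without any address"))
```

The caller can then write the address only when the result is not None.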
As for empty lines: first strip all spaces, tabs, and newlines, then compare the result with the empty string "". Or use the fact that an empty string evaluates to False when used in if/else.
line = line.strip()
if line:
    # ... code ...
import re

# Sample lines standing in for the contents of error_log.txt
fi = [
    '[Sun Jun 7 16:45:56 2020] [info] [client 64.242.88.10] (104)Connection reset by peer: client stopped connection before send body completed',
    '[Sun Jun 7 17:13:50 2020] [info] [client 64.242.88.10] (104)Connection reset by peer: client stopped connection before send body completed',
]

re_ip = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")

# The with block closes ip_only.txt once the loop finishes
with open('ip_only.txt', 'w') as ip_only:
    for line in fi:
        line = line.strip()
        if line:  # skip empty lines
            ip = re.findall(re_ip, line)
            if ip:  # skip lines that contain no IP address
                ip_only.write(ip[0] + "\n")
                print(ip[0])