Comparing CSV matching rows with Python
I have two CSV files, each containing a single column:
littleListIPs.csv:
10.187.172.140
10.187.172.141
10.187.172.142
10.187.172.143
10.187.172.144
10.187.172.145
10.187.172.154
10.187.172.155
(...)
BigListIPs.csv:
10.187.172.146
10.187.172.147
10.187.172.148
10.187.172.149
10.187.172.150
10.187.172.151
10.187.172.152
10.187.172.153
10.187.172.154
10.187.172.155
(...)
I need a script that compares them and creates a third file (output.csv) containing every row from littleListIPs.csv, plus a column stating whether that IP exists in BigListIPs.csv, as in the following output (you can use ";" instead of "|"):
10.187.172.140 | Not present in BigListIPs.csv
10.187.172.141 | Not present in BigListIPs.csv
10.187.172.142 | Not present in BigListIPs.csv
10.187.172.143 | Not present in BigListIPs.csv
10.187.172.144 | Not present in BigListIPs.csv
10.187.172.145 | Not present in BigListIPs.csv
10.187.172.154 | Present in BigListIPs.csv
10.187.172.155 | Present in BigListIPs.csv
I have seen a similar case that was solved here on Stack Overflow (Python: Comparing two CSV files and searching for similar items), but I could not adapt it to my needs, even though mine is a simpler case. Thanks for any help.
Written in Python 2.x, since that's what I have handy.
The entries from BigListIPs.csv are loaded into a set, because checking "in" on an array (list) is O(n), while checking "in" on a set is O(1) on average.
The files are opened using "with", which is good practice and makes sure they are closed properly.
Code:
#!/usr/bin/env python
import csv

little_ip_filename = "littleListIPs.csv"
big_ip_filename = "BigListIPs.csv"
output_filename = "results.csv"

# Load all the entries from BigListIPs into a set for quick lookup.
big_ips = set()
with open(big_ip_filename, 'r') as f:
    big_ip = csv.reader(f)
    for csv_row in big_ip:
        big_ips.add(csv_row[0])
# print big_ips

with open(little_ip_filename, 'r') as input_file, open(output_filename, 'w') as output_file:
    input_csv = csv.reader(input_file)
    output_csv = csv.writer(output_file)
    for csv_row in input_csv:
        ip = csv_row[0]
        status = "Present" if ip in big_ips else "Not Present"
        output_csv.writerow([ip, status + " in BigListIPs.csv"])
littleListIPs.csv:
10.187.172.140
10.187.172.141
10.187.172.142
10.187.172.143
10.187.172.144
10.187.172.145
10.187.172.154
10.187.172.155
BigListIPs.csv:
10.187.172.146
10.187.172.147
10.187.172.148
10.187.172.149
10.187.172.150
10.187.172.151
10.187.172.152
10.187.172.153
10.187.172.154
10.187.172.155
results.csv:
10.187.172.140,Not Present in BigListIPs.csv
10.187.172.141,Not Present in BigListIPs.csv
10.187.172.142,Not Present in BigListIPs.csv
10.187.172.143,Not Present in BigListIPs.csv
10.187.172.144,Not Present in BigListIPs.csv
10.187.172.145,Not Present in BigListIPs.csv
10.187.172.154,Present in BigListIPs.csv
10.187.172.155,Present in BigListIPs.csv
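The answer above targets Python 2.x. As a rough sketch of how the same approach might look on Python 3, the version below opens the files with newline="" (as the csv module documentation recommends) and writes ";" as the delimiter, which the question allows. The function name compare_ip_lists, its parameters, and the choice to skip blank rows are assumptions made for this illustration; they are not part of the original answer.

```python
import csv


def compare_ip_lists(little_path, big_path, output_path, delimiter=";"):
    """Write each IP from little_path with a presence label for the big list."""
    # Load the big list into a set for fast membership tests.
    with open(big_path, newline="") as f:
        big_ips = {row[0].strip() for row in csv.reader(f) if row}

    with open(little_path, newline="") as src, \
            open(output_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter=delimiter)
        for row in csv.reader(src):
            if not row:  # assumption: blank lines are skipped, not reported
                continue
            ip = row[0].strip()
            status = "Present" if ip in big_ips else "Not present"
            writer.writerow([ip, status + " in BigListIPs.csv"])
```

Calling compare_ip_lists("littleListIPs.csv", "BigListIPs.csv", "output.csv") should then produce one "IP;status" row per entry in the little list.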
You can just use "in" to check whether each IP is in the big list, and then write the result to a third file:
littlelistIPs = ['10.187.172.140', '10.187.172.141', '10.187.172.142', '10.187.172.143',
                 '10.187.172.144', '10.187.172.145', '10.187.172.154', '10.187.172.155']
biglistIPs = ['10.187.172.146', '10.187.172.147', '10.187.172.148', '10.187.172.149',
              '10.187.172.150', '10.187.172.151', '10.187.172.152', '10.187.172.153',
              '10.187.172.154', '10.187.172.155']

with open('output.csv', 'w') as f:
    for i in littlelistIPs:
        if i in biglistIPs:
            f.write(i + ' | Present in BigListIPs.csv\n')
        else:
            f.write(i + ' | Not present in BigListIPs.csv\n')
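For short lists like these, "in" on a list is perfectly fine. With a large BigListIPs.csv, each lookup scans the whole list, which is where a set may help. The sketch below is one way to check this on your own machine; the helper label_ips and the synthetic addresses are made up for the illustration, and the timings will vary from system to system.

```python
import timeit


def label_ips(little_ips, big_ips):
    """Return one 'IP | status' line per entry in little_ips."""
    return [
        ip + (" | Present" if ip in big_ips else " | Not present")
        + " in BigListIPs.csv"
        for ip in little_ips
    ]


# Synthetic data, made up for this illustration only.
big_list = ["10.0.%d.%d" % (i // 256, i % 256) for i in range(5000)]
big_set = set(big_list)
little = big_list[-50:] + ["192.168.0.%d" % i for i in range(50)]

# Both containers give the same answers; only the lookup cost differs.
assert label_ips(little, big_list) == label_ips(little, big_set)

# Time the same labelling work against the list and against the set.
list_time = timeit.timeit(lambda: label_ips(little, big_list), number=5)
set_time = timeit.timeit(lambda: label_ips(little, big_set), number=5)
print("list: %.4fs  set: %.4fs" % (list_time, set_time))
```

Since the results are identical either way, switching the big list to a set is a change you can make only once the file sizes justify it.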