Python: find last occurrence in a file
I have a file with different IPs.
192.168.11.2
192.1268.11.3
192.168.11.3
192.168.11.3
192.168.11.2
192.168.11.5
This is my code so far. It prints each IP and how many times it occurs, but how can I find out on which line each IP last occurred? Is there a simple way to do so?
liste = []
dit = {}
file = open('ip.txt','r')
file = file.readlines()
for line in file:
    liste.append(line.strip())
for element in liste:
    if element in dit:
        dit[element] +=1
    else:
        dit[element] = 1
for key,value in dit.items():
    print("%s occurs %s times, last occurence at line" %(key,value))
Output:
192.1268.11.3 occurs 1 times, last occurence at line
192.168.11.3 occurs 2 times, last occurence at line
192.168.11.2 occurs 2 times, last occurence at line
192.168.11.5 occurs 1 times, last occurence at line
Try this:
liste = []
dit = {}
file = open('ip.txt','r')
file = file.readlines()
for line in file:
    liste.append(line.strip())
# dit maps each IP to [count, line number of last occurrence]
for i, element in enumerate(liste, 1):
    if element in dit:
        dit[element][0] += 1
        dit[element][1] = i
    else:
        dit[element] = [1,i]
for key,value in dit.items():
    print("%s occurs %d times, last occurence at line %d" % (key, value[0], value[1]))
Here is a solution:
from collections import Counter

with open('ip.txt') as input_file:
    lines = input_file.read().splitlines()

# Find last occurrence, count
last_line = dict((ip, line_number) for line_number, ip in enumerate(lines, 1))
ip_count = Counter(lines)

# Print the stat, sorted by last occurrence
for ip in sorted(last_line, key=lambda k: last_line[k]):
    print('{} occurs {} times, last occurence at line {}'.format(
        ip, ip_count[ip], last_line[ip]))
Notes: the enumerate function generates the line numbers (starting at line 1). last_line is a dictionary where the key is the IP address and the value is the last line on which it occurs. The Counter class does the counting, which keeps that part very simple. To sort the report by the IP strings rather than by last occurrence, use sorted(last_line). The list of lines is scanned twice: once to calculate last_line and once to calculate ip_count. That means this solution might not be ideal if the file is large.

# Reuses liste from the question's code; dit must be an empty dict here
last_line_occurrence = {}
for element, line_number in zip(liste, range(1, len(liste)+1)):
    if element in dit:
        dit[element] +=1
    else:
        dit[element] = 1
    # Overwritten on every occurrence, so the last one wins
    last_line_occurrence[element] = line_number
for key,value in dit.items():
    print("%s occurs %s times, last occurence at line %s" %(key,value, last_line_occurrence[key]))
This can easily be done in a single pass, without reading the whole file into memory:
from collections import defaultdict

# Each IP maps to its last line number ("ind") and its occurrence count
d = defaultdict(lambda: {"ind":0,"count":0})
with open("in.txt") as f:
    for ind, line in enumerate(f,1):
        ip = line.rstrip()
        d[ip]["ind"] = ind
        d[ip]["count"] += 1
for ip ,v in d.items():
    print("IP {} appears {} time(s) and the last occurrence is at line {}".format(ip,v["count"],v["ind"]))
Output:
IP 192.1268.11.3 appears 1 time(s) and the last occurrence is at line 2
IP 192.168.11.3 appears 2 time(s) and the last occurrence is at line 4
IP 192.168.11.2 appears 2 time(s) and the last occurrence is at line 5
IP 192.168.11.5 appears 1 time(s) and the last occurrence is at line 6
If you want the IPs in the order they are first encountered, use an OrderedDict:
from collections import OrderedDict

od = OrderedDict()
with open("in.txt") as f:
    for ind, line in enumerate(f,1):
        ip = line.rstrip()
        od.setdefault(ip, {"ind": 0,"count":0})
        od[ip]["ind"] = ind
        od[ip]["count"] += 1
for ip ,v in od.items():
    print("IP {} appears {} time(s) and the last occurrence is at line {}".format(ip,v["count"],v["ind"]))
Output:
IP 192.168.11.2 appears 2 time(s) and the last occurrence is at line 5
IP 192.1268.11.3 appears 1 time(s) and the last occurrence is at line 2
IP 192.168.11.3 appears 2 time(s) and the last occurrence is at line 4
IP 192.168.11.5 appears 1 time(s) and the last occurrence is at line 6
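On Python 3.7 and later, a regular dict also preserves insertion order, so the same first-encountered ordering should be obtainable without OrderedDict. Below is a minimal sketch under that assumption. The function name ip_stats and the io.StringIO object standing in for in.txt are choices made for this example only, not part of the answer above.

```python
import io


def ip_stats(lines):
    """Map each IP to [count, last_line], in order of first appearance.

    Hypothetical helper; relies on dict insertion order (Python 3.7+).
    """
    stats = {}
    for ind, line in enumerate(lines, 1):
        ip = line.rstrip()
        entry = stats.setdefault(ip, [0, 0])
        entry[0] += 1   # occurrence count
        entry[1] = ind  # last line seen so far
    return stats


# In-memory stand-in for in.txt, using the data from the question
data = io.StringIO(
    "192.168.11.2\n192.1268.11.3\n192.168.11.3\n"
    "192.168.11.3\n192.168.11.2\n192.168.11.5\n"
)
stats = ip_stats(data)
for ip, (count, last) in stats.items():
    print("IP {} appears {} time(s) and the last occurrence is at line {}".format(ip, count, last))
```

Because ip_stats accepts any iterable of lines, an open file object can be passed in directly and the file is still read in a single pass.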
You can use another dictionary. In this dictionary you store, for each distinct line (here, each IP), the line number of its last occurrence, and overwrite it every time you find another occurrence. At the end, the dictionary holds the line number of the last occurrence of each one.
You will also need to increment a counter for each line read, so that you know which line you are currently on.
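The approach described above can be sketched roughly as follows. This is only an illustration: the function name last_occurrences and the in-memory sample list (used here instead of ip.txt) are assumptions made for the example.

```python
def last_occurrences(lines):
    """Return (counts, last_seen) for an iterable of text lines.

    Hypothetical helper for illustration; the name is not from the post.
    """
    counts = {}      # line content -> number of occurrences
    last_seen = {}   # line content -> line number of the last occurrence
    line_number = 0  # manual counter, incremented for each line read
    for line in lines:
        line_number += 1
        ip = line.strip()
        counts[ip] = counts.get(ip, 0) + 1
        last_seen[ip] = line_number  # overwritten on every new occurrence
    return counts, last_seen


# Sample data from the question, kept in memory instead of ip.txt
sample = [
    "192.168.11.2\n",
    "192.1268.11.3\n",
    "192.168.11.3\n",
    "192.168.11.3\n",
    "192.168.11.2\n",
    "192.168.11.5\n",
]
counts, last_seen = last_occurrences(sample)
for ip in counts:
    print("%s occurs %d times, last occurrence at line %d" % (ip, counts[ip], last_seen[ip]))
```

With a real file, the open file object can be passed in place of the list, so the file is read line by line rather than loaded all at once.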