
Python: find the last occurrence in a file

I have a file containing various IP addresses.

192.168.11.2
192.1268.11.3
192.168.11.3
192.168.11.3
192.168.11.2
192.168.11.5

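This is my code so far. I print each IP and how many times it occurs, but how can I also find the line on which each IP occurs for the last time? Is there a simple way to do this?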

liste = []

dit = {}
file = open('ip.txt','r')

file = file.readlines()

for line in file:
        liste.append(line.strip())

for element in liste:
        if element in dit:
                dit[element] +=1
        else:
                dit[element] = 1

for key,value in dit.items():
        print "%s occurs %s times, last occurence at line"  %(key,value)

Output:

192.1268.11.3 occurs 1 times, last occurence at line
192.168.11.3 occurs 2 times, last occurence at line
192.168.11.2 occurs 2 times, last occurence at line
192.168.11.5 occurs 1 times, last occurence at line

Try this:

liste = []
dit = {}  # ip -> [number of occurrences, line number of last occurrence]

# Read the file and strip the trailing newline from every line
with open('ip.txt', 'r') as f:
    for line in f:
        liste.append(line.strip())

# enumerate(liste, 1) yields 1-based line numbers
for i, element in enumerate(liste, 1):
    if element in dit:
        dit[element][0] += 1
        dit[element][1] = i
    else:
        dit[element] = [1, i]

for key, value in dit.items():
    print("%s occurs %d times, last occurrence at line %d" % (key, value[0], value[1]))

Here is a solution:

from collections import Counter

with open('ip.txt') as input_file:
    lines = input_file.read().splitlines()

    # Find last occurrence, count
    last_line = dict((ip, line_number) for line_number, ip in enumerate(lines, 1))
    ip_count = Counter(lines)

    # Print the stat, sorted by last occurrence
    for ip in sorted(last_line, key=lambda k: last_line[k]):
        print('{} occurs {} times, last occurrence at line {}'.format(
            ip, ip_count[ip], last_line[ip]))

Discussion

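  • I use the enumerate function to generate the line numbers (starting at line 1)
  • From the sequence of (ip, line_number) pairs it is easy to build the dictionary last_line, where the key is the IP address and the value is the last line on which it occurs. Later pairs overwrite earlier ones with the same key, so only the last line number survives
  • To count the occurrences I use the Counter class, which is very simple
  • If you want the report sorted by IP address instead, use sorted(last_line)
  • This solution has a performance cost: it scans the list of IPs twice, once to build last_line and once to build ip_count. If the file is large, this solution may therefore not be ideal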
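
A side note on the "sorted by IP address" bullet: sorted(last_line) compares the addresses as strings, so '192.168.11.10' would sort before '192.168.11.2'. The following is a minimal sketch of a numeric sort key. The helper name ip_sort_key and the sample data are hypothetical and not part of the original answer, and the sketch assumes every entry consists of dot-separated decimal numbers.

```python
def ip_sort_key(ip):
    """Hypothetical helper: turn '192.168.11.2' into (192, 168, 11, 2)."""
    return tuple(int(part) for part in ip.split('.'))

# Illustrative mapping of IP -> last line number (made-up sample data)
last_line = {'192.168.11.10': 3, '192.168.11.2': 5, '192.168.11.5': 6}

by_string = sorted(last_line)                   # lexicographic (string) order
by_number = sorted(last_line, key=ip_sort_key)  # numeric order, octet by octet

print(by_string)
print(by_number)
```

Because the key only splits on dots and converts each part to an integer, it also accepts a malformed entry such as 192.1268.11.3 from the sample file. Stricter validation would need something like the standard ipaddress module, which rejects that entry.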
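
# Assumes liste already holds the stripped lines of the file, as in the question
dit = {}                   # ip -> number of occurrences
last_line_occurrence = {}  # ip -> line number of last occurrence

for line_number, element in enumerate(liste, 1):
    if element in dit:
        dit[element] += 1
    else:
        dit[element] = 1
    # Overwritten on every occurrence, so the last one wins
    last_line_occurrence[element] = line_number

for key, value in dit.items():
    print("%s occurs %s times, last occurrence at line %s" % (key, value, last_line_occurrence[key]))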

This can easily be done in a single pass, without reading the whole file into memory:

from collections import defaultdict
d = defaultdict(lambda: {"ind":0,"count":0})

with open("in.txt") as f:
    for ind, line in enumerate(f,1):
        ip = line.rstrip()
        d[ip]["ind"] = ind
        d[ip]["count"]  += 1

for ip ,v in d.items():
    print("IP {}  appears {} time(s) and the last occurrence is at  line {}".format(ip,v["count"],v["ind"]))

Output:

IP 192.1268.11.3  appears 1 time(s) and the last occurrence is at line 2
IP 192.168.11.3  appears 2 time(s) and the last occurrence is at line 4
IP 192.168.11.2  appears 2 time(s) and the last occurrence is at line 5
IP 192.168.11.5  appears 1 time(s) and the last occurrence is at line 6

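If you want the IPs in the order in which they are first encountered, use an OrderedDict (on Python 3.7 and later a plain dict preserves insertion order too, so this mainly matters on older versions):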

from collections import OrderedDict
od = OrderedDict()
with open("in.txt") as f:
    for ind, line in enumerate(f,1):
        ip = line.rstrip()
        od.setdefault(ip, {"ind": 0,"count":0})
        od[ip]["ind"] = ind
        od[ip]["count"] += 1

for ip ,v in od.items():
    print("IP {}  appears {} time(s) and the last occurrence is at  line {}".format(ip,v["count"],v["ind"]))

Output:

IP 192.168.11.2  appears 2 time(s) and the last occurrence is at line 5
IP 192.1268.11.3  appears 1 time(s) and the last occurrence is at line 2
IP 192.168.11.3  appears 2 time(s) and the last occurrence is at line 4
IP 192.168.11.5  appears 1 time(s) and the last occurrence is at line 6

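You can use a second dictionary. In it you store, for each distinct line, the line number of its last occurrence, and you overwrite that value every time you find another occurrence. At the end, this dictionary holds the line number of the last occurrence of every distinct line.

Obviously, you need to increment a counter for every line you read, so that you know which line you are currently on.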
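
The approach described above can be sketched as follows. This is only one possible reading of the answer, which gives no code of its own. The function name last_occurrences and the inlined sample list are assumptions made for illustration, and the sample mirrors the IPs from the question so the sketch runs without ip.txt.

```python
def last_occurrences(lines):
    """Return (counts, last_seen) for an iterable of lines.

    counts maps each stripped line to how often it occurs;
    last_seen maps it to the 1-based number of the line where it last occurred.
    """
    counts = {}
    last_seen = {}
    line_number = 0
    for line in lines:
        line_number += 1             # counter incremented for every line read
        ip = line.strip()
        counts[ip] = counts.get(ip, 0) + 1
        last_seen[ip] = line_number  # overwritten on every new occurrence
    return counts, last_seen


# Sample data copied from the question (hypothetical stand-in for ip.txt)
sample = ["192.168.11.2", "192.1268.11.3", "192.168.11.3",
          "192.168.11.3", "192.168.11.2", "192.168.11.5"]

counts, last_seen = last_occurrences(sample)
for ip in counts:
    print("%s occurs %d times, last occurrence at line %d"
          % (ip, counts[ip], last_seen[ip]))
```

Since the function accepts any iterable of lines, an open file object can be passed directly (for example inside a with open('ip.txt') as f: block), in which case the file is processed line by line rather than loaded into memory at once.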

