
Fastest way to compare common items in two lists

I have two lists like this:

listt = [["a","abc","zzz","xxx","abc","abc"],["yyy","ggg","abc","cccc"]]

I have another query list like this:

queryList = ["abc","cccc","abc","yyy"]

queryList & listt[0] have 2 "abc" in common.

queryList & listt[1] have 1 "abc", 1 "cccc" and 1 "yyy" in common.

So I want an output like this:

[2,3] #2 = Total common items between queryList & listt[0]
      #3 = Total common items between queryList & listt[1]

I'm currently using loops to do this, but it seems slow. I will have millions of lists, each with thousands of items.

listt = [["a","abc","zzz","xxx","abc","abc"],["yyy","ggg","abc","cccc"]]
queryList = ["abc","cccc","abc","yyy"]

totalMatch = []
for hashtree in listt:
    matches = 0
    tempQueryHash = queryList.copy()
    for hash in hashtree:
        for i in range(len(tempQueryHash)):
            if tempQueryHash[i]==hash:
                matches +=1
                tempQueryHash[i] = "" #Don't Match the same block twice.
                break

    totalMatch.append(matches)
print(totalMatch)

Well, I'm still learning the tricks in Python. But according to this earlier post, something like the following should work:

from collections import Counter
listt = [["a","abc","zzz","xxx","abc","abc"],["yyy","ggg","abc","cccc"]]
queryList = ["abc","cccc","abc","yyy"]
OutputList = [len(list((Counter(x) & Counter(queryList)).elements())) for x in listt]
# [2, 3]

I'll keep an eye out for other approaches...
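For readers unfamiliar with it, the `&` operator on two `Counter` objects is a multiset intersection: for each key it keeps the smaller of the two counts. A minimal sketch on the first sample list (the names `a`, `q` and `common` are only for illustration):

```python
from collections import Counter

a = Counter(["a", "abc", "zzz", "xxx", "abc", "abc"])  # "abc" occurs 3 times
q = Counter(["abc", "cccc", "abc", "yyy"])             # "abc" occurs 2 times

# Multiset intersection: each key keeps min(count in a, count in q).
common = a & q
print(common)                # Counter({'abc': 2})
print(sum(common.values()))  # 2
```

This is why a repeated item is never matched more often than it occurs in either list.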

An improvement on JvdV's answer.

Basically, sum the values instead of counting the elements, and also cache the Counter built from queryList.

from collections import Counter
listt = [["a","abc","zzz","xxx","abc","abc"],["yyy","ggg","abc","cccc"]]
queryList = ["abc","cccc","abc","yyy"]
queryListCounter = Counter(queryList)
OutputList = [sum((Counter(x) & queryListCounter).values()) for x in listt]
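If the query is short compared with the lists, one possible variation is to count only the items that occur in the query at all, so that each per-list Counter stays small. This is a sketch, not a measured improvement: the helper name `count_common` is hypothetical, and whether it is actually faster depends on the data, so it is worth timing on a realistic sample.

```python
from collections import Counter

def count_common(lists, query):
    """Return, for each list, the size of its multiset intersection with query."""
    query_counter = Counter(query)  # built once, reused for every list
    totals = []
    for items in lists:
        # Count only the items that appear in the query at all.
        seen = Counter(item for item in items if item in query_counter)
        # An item can match at most as many times as it occurs in the query.
        totals.append(sum(min(n, query_counter[k]) for k, n in seen.items()))
    return totals

listt = [["a", "abc", "zzz", "xxx", "abc", "abc"], ["yyy", "ggg", "abc", "cccc"]]
queryList = ["abc", "cccc", "abc", "yyy"]
print(count_common(listt, queryList))  # [2, 3]
```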

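You can build a list of the matches between a list in listt and queryList and count the number of matches. Note that this compares every pair of items, so an item repeated in queryList is counted once per repetition: for listt[1] it gives 4 rather than the 3 asked for in the question.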

output = ([i == z for i in listt[1] for z in queryList])
print(output.count(True))
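A quick check on the sample data, comparing the pairwise count with the Counter-based count from the other answers (the variable names `pairwise_count` and `multiset_count` are only for illustration):

```python
from collections import Counter

listt = [["a", "abc", "zzz", "xxx", "abc", "abc"], ["yyy", "ggg", "abc", "cccc"]]
queryList = ["abc", "cccc", "abc", "yyy"]

# Pairwise comparison: every (item, query item) pair that is equal counts once.
pairwise_count = [i == z for i in listt[1] for z in queryList].count(True)
print(pairwise_count)  # 4, because "abc" matches both "abc" entries in queryList

# Multiset intersection: each item is matched at most once.
multiset_count = sum((Counter(listt[1]) & Counter(queryList)).values())
print(multiset_count)  # 3
```

The pairwise version also does len(list) * len(queryList) comparisons per list, which is likely to be slow at the sizes mentioned in the question.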

