Intersection of two lists in Python while avoiding redundant matches

Question

I have two Python lists, match1 and match2:

match1 = ['Submit', 'paper', 'error', 'code', 'notcomplete', 'next']

match2 = ['Submit', 'paper', 'error', 'code', 'blocked', 'paper', 'next']

Right now I use __contains__ to find the words common to match1 and match2:

common = filter(set(match1).__contains__,match2)

print(list(common))

This gives me the following output:

['Submit', 'paper', 'error', 'code', 'paper', 'next']

The second 'paper' in match2 is matched against the single 'paper' in match1 a second time.

Is there a way to avoid this and get the following output instead?

['Submit', 'paper', 'error', 'code', 'next']

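Edit:

The order of the list matters. That is why I use __contains__ rather than a set intersection.

My concern is not whether the list of common words contains duplicate entries. I am trying to prevent an entry in match1 that has already been matched from being matched again with another entry in match2. If both lists contained 'paper' twice, I would want it to appear as two separate entries in the common list.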

Answer 1

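You can convert the lists to sets and then use the set intersection method to find the common words. Note that sets are unordered, so the original list order is not preserved.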

match1 = ['Submit', 'paper', 'error', 'code', 'notcomplete', 'next']
match2 = ['Submit', 'paper', 'error', 'code', 'blocked', 'paper', 'next']
match1_set = set(match1)
match2_set = set(match2)
print(match1_set.intersection(match2_set))
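If the order of match1 matters, one possible variation on the set approach is to sort the intersection by each word's first position in match1. This is only a sketch: the name ordered is introduced here for illustration, and it assumes each common word should appear once.

```python
match1 = ['Submit', 'paper', 'error', 'code', 'notcomplete', 'next']
match2 = ['Submit', 'paper', 'error', 'code', 'blocked', 'paper', 'next']

# Intersect as sets, then restore the order in which words first appear in match1.
# list.index is a linear scan, which is fine for short lists like these.
ordered = sorted(set(match1) & set(match2), key=match1.index)
print(ordered)  # ['Submit', 'paper', 'error', 'code', 'next']
```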

Answer 2

What if you simply remove the duplicates like this:

print(list(dict.fromkeys(common)))
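For reference, a self-contained version of this idea might look like the sketch below. The names known and deduped are introduced here for illustration. It relies on dict preserving insertion order (Python 3.7+), and it drops every repeated word, so it would not keep two 'paper' entries even if both lists contained two.

```python
match1 = ['Submit', 'paper', 'error', 'code', 'notcomplete', 'next']
match2 = ['Submit', 'paper', 'error', 'code', 'blocked', 'paper', 'next']

known = set(match1)                          # fast membership test
common = [w for w in match2 if w in known]   # keeps match2 order, including repeats
deduped = list(dict.fromkeys(common))        # keep only the first occurrence of each word
print(deduped)  # ['Submit', 'paper', 'error', 'code', 'next']
```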

Answer 3

You can use a list comprehension and then pass the result through set():

match1 = ['Submit', 'paper', 'error', 'code', 'notcomplete', 'next']
match2 = ['Submit', 'paper', 'error', 'code', 'blocked', 'paper', 'next']
print(set([x for x in match1 if x in match2]))

Answer 4

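Hi Krishnadas, welcome to StackOverflow,

You can convert the lists to sets, which are data structures that do not allow duplicates. I see you already do this for match1, but you can do it for match2 as well: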

common = filter(set(match1).__contains__,set(match2))

Answer 5

I was able to find an answer.

match1 = ['Submit', 'paper', 'error', 'code', 'notcomplete', 'next']

match2 = ['Submit', 'paper', 'error', 'code', 'blocked', 'paper', 'next']

common = list(filter(match1.__contains__,match2))

common_final = [w for w in match1 if w in common]

print(common_final)

which gives

['Submit', 'paper', 'error', 'code', 'next']

Thanks to everyone who helped.
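The approach above works for this input but does not track how many times a word occurs. For the case described in the question's edit, where both lists may contain the same word more than once and each entry of match1 should be matched at most once, a count-based variant is one option. This is a minimal sketch: the function name ordered_common is hypothetical, and it assumes the result should follow the order of the second list.

```python
from collections import Counter

def ordered_common(first, second):
    """Return the items of `second` that can be paired one-to-one with items of `first`."""
    remaining = Counter(first)  # how many times each word may still be matched
    result = []
    for word in second:
        if remaining[word] > 0:   # an unmatched copy is still available in `first`
            result.append(word)
            remaining[word] -= 1  # consume it so it cannot be matched again
    return result

match1 = ['Submit', 'paper', 'error', 'code', 'notcomplete', 'next']
match2 = ['Submit', 'paper', 'error', 'code', 'blocked', 'paper', 'next']
print(ordered_common(match1, match2))  # ['Submit', 'paper', 'error', 'code', 'next']
```

With two 'paper' entries in both lists, both would be kept, because each copy in the first list can be consumed once.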

5 answers

Answer 1
2 2020-06-17 09:15:52

Answer 2
0 2020-06-17 09:13:17

Answer 3
0 2020-06-17 09:14:44

Answer 4
0 2020-06-17 09:15:55

Answer 5
0 2020-06-17 10:20:48
