Remove duplicates in a tuple of tuples

Question

I have the following tuple of tuples:

# my Noah's Ark    
myanimals = (('cat', 'dog'), ('callitrix', 'platypus'), ('anaconda', 'python'), ('mouse', 'girafe'),   ... ,('platypus', 'callitrix'))

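Since I want a list of unique 2-tuples of animals, the pair ('platypus', 'callitrix') is considered a duplicate of ('callitrix', 'platypus').

How can I elegantly (with as little code as possible) remove from myanimals every pair (b, a) that duplicates a pair (a, b)?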

Answer 1

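I'll answer in two parts:

The first is not strictly an answer to your question, but a suggestion that could make this easier to work with: if your code allows using set instead of tuple for the pairs, you can use the in keyword to check what you need: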

myanimals = ({'cat', 'dog'}, {'callitrix', 'platypus'}, {'anaconda', 'python'}, {'mouse', 'girafe'},   ... , {'platypus', 'callitrix'})
{'platypus', 'callitrix'} in myanimals # returns True, since {'a', 'b'}=={'b', 'a'}

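Consequently, making a set of sets removes duplicates automatically. Plain sets are not hashable, so the inner sets have to be frozenset objects:

myanimals = {frozenset({'cat', 'dog'}), frozenset({'callitrix', 'platypus'}), frozenset({'anaconda', 'python'}), frozenset({'mouse', 'girafe'}),   ..., frozenset({'platypus', 'callitrix'})}

This automatically drops the duplicate {'platypus', 'callitrix'}.

However, doing this means that a pair cannot consist of the same animal twice, because {'a', 'a'} is just {'a'}.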
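As a rough sketch, the set-based idea could be applied to the original tuple of tuples as follows. The sample data leaves out the elided entries from the question, and the names unique_sets, has_platypus_pair and same_animal are only illustrative. frozenset is used because a plain set cannot be stored inside another set.

```python
# Sample data from the question, without the elided "..." entries.
myanimals = (('cat', 'dog'), ('callitrix', 'platypus'), ('anaconda', 'python'),
             ('mouse', 'girafe'), ('platypus', 'callitrix'))

# Build a set of frozensets: pairs that differ only in order collapse into one.
unique_sets = {frozenset(pair) for pair in myanimals}

# Membership checks ignore the order of the two animals.
has_platypus_pair = frozenset(('platypus', 'callitrix')) in unique_sets

# The caveat mentioned above: a pair of identical animals shrinks to one element.
same_animal = frozenset(('cat', 'cat'))
```

Here unique_sets ends up with four entries, and same_animal contains only 'cat', which is why this approach does not suit data where both members of a pair can be equal.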

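The second part: actually working with tuples is a bit more cumbersome. Since tuples are immutable, you need to build a new collection from scratch and filter out the duplicates along the way: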

myanimals = (('cat', 'dog'), ('callitrix', 'platypus'), ('anaconda', 'python'), ('mouse', 'girafe'),   ... ,('platypus', 'callitrix'))
myanimals_clean = []
for pair in myanimals:
   if pair not in myanimals_clean and (pair[1], pair[0]) not in myanimals_clean:
       myanimals_clean.append(pair)

You could tidy this up a little with itertools.permutations(), but I don't think it is worth the hassle of an extra import.
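For reference, here is a sketch of what the itertools.permutations() variant of the loop might look like. It assumes the sample data without the elided entries, and converts the result back to a tuple at the end, which the loop above does not do.

```python
from itertools import permutations

# Sample data from the question, without the elided "..." entries.
myanimals = (('cat', 'dog'), ('callitrix', 'platypus'), ('anaconda', 'python'),
             ('mouse', 'girafe'), ('platypus', 'callitrix'))

myanimals_clean = []
for pair in myanimals:
    # permutations(pair) yields every ordering of the pair: (a, b) and (b, a).
    if not any(p in myanimals_clean for p in permutations(pair)):
        myanimals_clean.append(pair)

# Convert back to a tuple of tuples, matching the input type.
myanimals_clean = tuple(myanimals_clean)
```

The first occurrence of each pair is kept, so ('callitrix', 'platypus') survives and the later ('platypus', 'callitrix') is dropped.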

Finally, you can mix the two approaches: convert your tuple of tuples into a tuple of sets to do the check, then convert back to tuples:

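myanimals = tuple(frozenset(pair) for pair in myanimals)  # frozenset is hashable and ignores order
myanimals = tuple(tuple(pair) for pair in dict.fromkeys(myanimals))  # dict.fromkeys drops repeats, keeping the first
# note: the order of the two animals inside each resulting tuple is not guaranteed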

Answer 2

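You can use a set on the sorted tuple values, or convert the list to a dictionary whose keys are the tuples in sorted order. Either way, only one value is left per combination: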

list({*map(tuple,map(sorted,myanimals))})

or

list(dict(zip(map(tuple,map(sorted,myanimals)),myanimals)).values())

Broken down:

[*map(sorted,myanimals)] # sorted tuples

# [['cat', 'dog'], ['callitrix', 'platypus'], ['anaconda', 'python'], ['girafe', 'mouse'], ['callitrix', 'platypus']]

# notice that both ('callitrix', 'platypus') and ('platypus', 'callitrix')
# are converted to ('callitrix', 'platypus')

Since this gives a list of lists, and dictionary keys need to be hashable, we convert the items to tuples:

[*map(tuple,map(sorted,myanimals))]

# [('cat', 'dog'), ('callitrix', 'platypus'), ('anaconda', 'python'), ('girafe', 'mouse'), ('callitrix', 'platypus')]

These can already be turned into a list of unique pairs by placing them in a set and converting the set back to a list:

list({*map(tuple,map(sorted,myanimals))})

# [('girafe', 'mouse'), ('callitrix', 'platypus'), ('anaconda', 'python'), ('cat', 'dog')]
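Because a set does not preserve insertion order, the list above can come out in any order. If the input order of the pairs matters, one possible variation (not part of the original answer) is dict.fromkeys, which keeps the first occurrence of each key in insertion order on Python 3.7 and later. The name ordered_unique is illustrative.

```python
# Sample data from the question, without the elided "..." entries.
myanimals = (('cat', 'dog'), ('callitrix', 'platypus'), ('anaconda', 'python'),
             ('mouse', 'girafe'), ('platypus', 'callitrix'))

# Sort each pair, make it hashable, then let dict.fromkeys drop repeated keys
# while preserving the order in which keys were first seen.
ordered_unique = list(dict.fromkeys(map(tuple, map(sorted, myanimals))))

# [('cat', 'dog'), ('callitrix', 'platypus'), ('anaconda', 'python'), ('girafe', 'mouse')]
```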

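If you don't care about the original order of the values within each tuple, you can stop there. But if you need ('mouse','girafe') to stay in that order, we need an extra step that separates the uniqueness filtering from the tuple contents. This is where the dictionary comes in. We want to use the sorted tuples as keys but keep the tuples in their original order as values. The zip function achieves this by pairing each key with its original tuple: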

[*zip(map(tuple,map(sorted,myanimals)),myanimals)]

# [(('cat', 'dog'), ('cat', 'dog')), (('callitrix', 'platypus'), ('callitrix', 'platypus')), (('anaconda', 'python'), ('anaconda', 'python')), (('girafe', 'mouse'), ('mouse', 'girafe')), (('callitrix', 'platypus'), ('platypus', 'callitrix'))]

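Feeding that into a dictionary keeps only the last value for each distinct key, and we can simply pick up the values to form the resulting list of tuples: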

list(dict(zip(map(tuple,map(sorted,myanimals)),myanimals)).values())
  
# [('cat', 'dog'), ('platypus', 'callitrix'), ('anaconda', 'python'), ('mouse', 'girafe')]

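Alternatively

Note that the above picked ('platypus', 'callitrix') over ('callitrix', 'platypus') because it retains the last occurrence of duplicate entries.

If you need to keep the first occurrence instead, you can use a different approach that progressively fills a set with both orderings of each tuple and filters on the first time each tuple is added to the set.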

[t for s in [set()] for t in myanimals
   if t not in s and not s.update((t,t[::-1]))]
  
# [('cat', 'dog'), ('callitrix', 'platypus'), ('anaconda', 'python'), ('mouse', 'girafe')]
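The comprehension above is compact but relies on a side effect inside the filter. For readability, the same first-occurrence idea could be written as an explicit loop. This is only a sketch: the function name unique_pairs is hypothetical, and it uses the sorted tuple as the lookup key rather than storing both orderings.

```python
def unique_pairs(pairs):
    """Return the pairs as a tuple, keeping the first occurrence of each unordered pair."""
    seen = set()
    result = []
    for pair in pairs:
        key = tuple(sorted(pair))  # same key for (a, b) and (b, a)
        if key not in seen:
            seen.add(key)
            result.append(pair)    # keep the pair in its original orientation
    return tuple(result)


# Sample data from the question, without the elided "..." entries.
myanimals = (('cat', 'dog'), ('callitrix', 'platypus'), ('anaconda', 'python'),
             ('mouse', 'girafe'), ('platypus', 'callitrix'))

cleaned = unique_pairs(myanimals)
# (('cat', 'dog'), ('callitrix', 'platypus'), ('anaconda', 'python'), ('mouse', 'girafe'))
```

Unlike the frozenset approach, this also handles a pair such as ('cat', 'cat') without losing an element.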

Solution 1
0 2021-01-19 19:41:47

Solution 2
0 2021-01-19 20:49:47
