Remove duplicates from a list of a list of unordered dictionaries
Consider the following:
[
[
{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'bob', 'score': 99}
],
[
{'name': 'frank', 'score': 100},
{'name': 'fred', 'score': 19},
{'name': 'bob', 'score': 99}
],
[
{'name': 'bob', 'score': 99},
{'name': 'frank', 'score': 100},
{'name': 'fred', 'score': 19}
],
[
{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'stu', 'score': 69}
]
]
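Ignoring the order of the dictionaries within each list, how can I remove the duplicates so that the output contains only two lists: one with bob and one with stu?
The output would look like: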
[
[
{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'bob', 'score': 99}
],
[
{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'stu', 'score': 69}
]
]
You could try something like this:
dict_list = [[{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'bob', 'score': 99}],
[{'name': 'frank', 'score': 100},
{'name': 'fred', 'score': 19},
{'name': 'bob', 'score': 99}],
[{'name': 'bob', 'score': 99},
{'name': 'frank', 'score': 100},
{'name': 'fred', 'score': 19}],
[{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'stu', 'score': 69}]]
# create list of names you've seen before
name_lists = []
# create lists of unique lists
unique_lists = []
# loop over each list you have
for L in dict_list:
    # get list of names
    names = [i['name'] for i in L]
    # check if you've seen this set of names before
    if set(names) not in [set(n) for n in name_lists]:
        print(names)
        # save these names
        name_lists.append(names)
        # add this list to your list of unique lists
        unique_lists.append(L)
Output:
['fred', 'frank', 'bob']
['fred', 'frank', 'stu']
unique_lists
Output:
[[{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'bob', 'score': 99}],
[{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'stu', 'score': 69}]]
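Note that this method keeps the scores only for the first occurrence of each unique set of names and discards the scores whenever a set of names repeats. If the same names may appear with different scores, you may want to keep every unique combination of names and scores. In that case, you can follow the approach given by PacketLoss below: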
name_lists = []
unique_lists = []
for di, d in enumerate(dict_list):
    # get list of name, score tuples
    r = [(i['name'], i['score']) for i in d]
    # sort tuples alphabetically by name
    r.sort(key=lambda tup: tup[0])
    # check if these names and scores have been seen before
    if r not in name_lists:
        name_lists.append(r)
        unique_lists.append(dict_list[di])
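To make the trade-off described above concrete, here is a minimal sketch that runs both keys over the same input. The two-list `groups` data and the helper name `dedupe` are made up for illustration and are not part of the original answer.

```python
# Hypothetical input: same names in both lists, but bob's score differs.
groups = [
    [{'name': 'fred', 'score': 19}, {'name': 'bob', 'score': 99}],
    [{'name': 'bob', 'score': 98}, {'name': 'fred', 'score': 19}],
]

def dedupe(groups, key):
    """Keep the first list for each distinct key (illustrative helper)."""
    seen, kept = [], []
    for group in groups:
        k = key(group)
        if k not in seen:
            seen.append(k)
            kept.append(group)
    return kept

# Key on names only: the second list counts as a duplicate and its score is lost.
by_name = dedupe(groups, lambda g: sorted(d['name'] for d in g))
# Key on (name, score) pairs: the differing score keeps both lists.
by_name_and_score = dedupe(groups, lambda g: sorted((d['name'], d['score']) for d in g))

print(len(by_name))            # 1
print(len(by_name_and_score))  # 2
```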
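Because the ordering differs, a simple == will not match. We can work around this by collecting the data, sorting it into a list of tuples, and checking whether a matching list has already been seen.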
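The order sensitivity of == can be checked with a small sketch before looking at the full data. The lists `a` and `b` and the helper `as_sorted_tuples` are illustrative names, not part of the original answer.

```python
# Two lists holding the same dictionaries in a different order.
a = [{'name': 'fred', 'score': 19}, {'name': 'frank', 'score': 100}]
b = [{'name': 'frank', 'score': 100}, {'name': 'fred', 'score': 19}]

# List equality compares element by element, so order matters.
print(a == b)  # False

def as_sorted_tuples(group):
    """Turn a list of dicts into a sorted list of (name, score) tuples."""
    return sorted((d['name'], d['score']) for d in group)

# After sorting, both lists reduce to the same representation.
print(as_sorted_tuples(a) == as_sorted_tuples(b))  # True
```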
data = [[{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'bob', 'score': 99}],
[{'name': 'frank', 'score': 100},
{'name': 'fred', 'score': 19},
{'name': 'bob', 'score': 99}],
[{'name': 'bob', 'score': 99},
{'name': 'frank', 'score': 100},
{'name': 'fred', 'score': 19}],
[{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'stu', 'score': 69}]]
seen = list()
result = list()
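for idx, d in enumerate(data):
    # build (name, score) tuples and sort them by name
    r = [(i['name'], i['score']) for i in d]
    r.sort(key=lambda tup: tup[0])
    # keep the list only if this combination has not been seen yet
    if r not in seen:
        seen.append(r)
        result.append(data[idx])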
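With this approach we check that both the scores and the names match exactly, so if one of the scores in a duplicate is changed to 98, it is no longer treated as a duplicate.
Output: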
[[{'name': 'fred', 'score': 19}, {'name': 'frank', 'score': 100}, {'name': 'bob', 'score': 99}], [{'name': 'fred', 'score': 19}, {'name': 'frank', 'score': 100}, {'name': 'stu', 'score': 69}]]
Output with one score modified in the data:
data = [[{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'bob', 'score': 99}],
[{'name': 'frank', 'score': 100},
{'name': 'fred', 'score': 19},
{'name': 'bob', 'score': 99}],
[{'name': 'bob', 'score': 98},
{'name': 'frank', 'score': 100},
{'name': 'fred', 'score': 19}],
[{'name': 'fred', 'score': 19},
{'name': 'frank', 'score': 100},
{'name': 'stu', 'score': 69}]]
[[{'name': 'fred', 'score': 19}, {'name': 'frank', 'score': 100}, {'name': 'bob', 'score': 99}], [{'name': 'bob', 'score': 98}, {'name': 'frank', 'score': 100}, {'name': 'fred', 'score': 19}], [{'name': 'fred', 'score': 19}, {'name': 'frank', 'score': 100}, {'name': 'stu', 'score': 69}]]
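Both answers look up each key in a list of previously seen keys, which is a linear scan per inner list. As a hedged alternative sketch, the key can be made hashable and stored in a set instead. This assumes every dictionary value is hashable and that values under the same key can be compared with each other; the function name `dedupe_groups` and the `sample` data are hypothetical.

```python
def dedupe_groups(groups):
    """Drop inner lists that hold the same dicts in any order (first one wins)."""
    seen = set()
    result = []
    for group in groups:
        # Each dict becomes a sorted tuple of its items; sorting those tuples
        # makes the key independent of the order of dicts in the list.
        key = tuple(sorted(tuple(sorted(d.items())) for d in group))
        if key not in seen:
            seen.add(key)
            result.append(group)
    return result

# Illustrative data: the first two lists differ only in order.
sample = [
    [{'name': 'fred', 'score': 19}, {'name': 'bob', 'score': 99}],
    [{'name': 'bob', 'score': 99}, {'name': 'fred', 'score': 19}],
    [{'name': 'fred', 'score': 19}, {'name': 'stu', 'score': 69}],
]
unique = dedupe_groups(sample)
print(len(unique))  # 2
```

Using a sorted tuple rather than a frozenset keeps repeated dictionaries within one inner list significant, so a list containing the same entry twice is not merged with a list containing it once.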