Remove duplicates from a Python list

Question

I have a dynamic list:

[{'dashboard': 'AG', 'end_date': '2021-06-17 13:13:43', 'location': 'EC & pH Reading', 'zone_name': 'Zone 1 Left'}, 

{'dashboard': 'AG', 'end_date': '2021-06-17 12:40:06', 'location': 'Harvest', 'zone_name': 'Zone 2 Left'}, 

{'dashboard': 'AG', 'end_date': '2021-06-16 15:52:52', 'location': 'Harvest', 'zone_name': 'Zone 1 Left' }, 

{'dashboard': 'AG', 'end_date': '2021-06-16 15:45:51', 'location': 'Harvest', 'zone_name': 'Zone 1 Left'}]

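I want to remove duplicates based on zone_name and location. One zone_name has 3 values, and I want to remove the older ones. I have already sorted the list by end_date, so the latest entries come first. Now I need to remove the duplicate values based on zone_name and location.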

This is what I tried:

final_zone = []
res_list = []
for i in sortedArray:
     if i["location"] not in final_zone:
          sch.append(i)
          final_zone.append(i["location"])

What changes do I need to make to remove duplicates based on both zone_name and location?

That is, Zone 1 Left has 3 values, and I need the latest one.

Answer 1

A general approach for an unsorted list:

from itertools import groupby
from operator import itemgetter

# sorting and grouping functions
f_sort = itemgetter("location", "zone_name", "end_date")  # sort by descending
f_group = itemgetter("location", "zone_name")  # group sorted by

result = [
    next(g) for _, g in  # only take latest of each group
    groupby(sorted(array, key=f_sort, reverse=True), key=f_group)
]

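Documentation for the utilities used here (itertools.groupby, operator.itemgetter, and the built-in sorted) is worth a look; all of them are handy in many use cases.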
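One detail worth illustrating is why the sort comes before the grouping: groupby only merges items that sit next to each other. The following is a minimal sketch on hypothetical sample rows (the names rows, group_key, unsorted_groups and sorted_groups are made up for this illustration):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows, deliberately left unsorted.
rows = [
    {"location": "Harvest", "zone_name": "Zone 1 Left", "end_date": "2021-06-16 15:45:51"},
    {"location": "EC & pH Reading", "zone_name": "Zone 1 Left", "end_date": "2021-06-17 13:13:43"},
    {"location": "Harvest", "zone_name": "Zone 1 Left", "end_date": "2021-06-16 15:52:52"},
]

group_key = itemgetter("location", "zone_name")

# Without sorting, the two Harvest rows are not adjacent,
# so groupby reports them as two separate groups.
unsorted_groups = [key for key, _ in groupby(rows, key=group_key)]

# After sorting by the group key, equal keys are adjacent
# and collapse into a single group.
sorted_groups = [key for key, _ in groupby(sorted(rows, key=group_key), key=group_key)]

print(len(unsorted_groups), len(sorted_groups))
```

Including end_date in the sort key, as the answer does, additionally controls which row comes first inside each group.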

Answer 2

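You can loop over the list and remember the indices you want to keep.

keepers = {}
for i, item in enumerate(sorted_array):
    key = (item['zone_name'], item['location'])
    # sorted_array is newest first, so only the first index seen per key is stored
    keepers.setdefault(key, i)

final_array = []
for i in keepers.values():
    final_array.append(sorted_array[i])

As a bonus, keepers.keys() gives you every (zone_name, location) pair.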

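Your own approach can work too. Change sch.append(i) to res_list.append(i), and track the combination of zone_name and location rather than location alone. Because your list is sorted newest first, the first entry seen for each combination is the one to keep. If the list were sorted oldest first instead, iterate in reverse ( for i in sorted_array[::-1] ) so that the newest entry is still the one kept.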
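The adjustment to the asker's loop described above could look roughly like this. It is a sketch that assumes the list is already sorted newest first, as stated in the question; the function name dedupe_newest_first is made up for illustration:

```python
def dedupe_newest_first(sorted_array):
    """Keep the first entry seen for each (zone_name, location) pair.

    Assumes sorted_array is ordered newest first, so the first entry
    seen for a pair is also the latest one.
    """
    seen = set()
    res_list = []
    for item in sorted_array:
        key = (item["zone_name"], item["location"])
        if key not in seen:
            seen.add(key)
            res_list.append(item)
    return res_list


# Sample data from the question, already sorted by end_date descending.
sorted_array = [
    {"dashboard": "AG", "end_date": "2021-06-17 13:13:43", "location": "EC & pH Reading", "zone_name": "Zone 1 Left"},
    {"dashboard": "AG", "end_date": "2021-06-17 12:40:06", "location": "Harvest", "zone_name": "Zone 2 Left"},
    {"dashboard": "AG", "end_date": "2021-06-16 15:52:52", "location": "Harvest", "zone_name": "Zone 1 Left"},
    {"dashboard": "AG", "end_date": "2021-06-16 15:45:51", "location": "Harvest", "zone_name": "Zone 1 Left"},
]

result = dedupe_newest_first(sorted_array)
print([r["end_date"] for r in result])
```

Using a set for the membership check keeps each lookup cheap, which matters more as the list grows.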

Answer 3

clean_list = []

for elem in lst:
    # check whether an element with the same zone name and location
    # is already present in the clean list
    yet_present = any(
        el['zone_name'] == elem['zone_name'] and el['location'] == elem['location']
        for el in clean_list
    )
    if not yet_present:
        clean_list.append(elem)

Output:

[{'dashboard': 'AG',
  'end_date': '2021-06-17 13:13:43',
  'location': 'EC & pH Reading',
  'zone_name': 'Zone 1 Left'},
 {'dashboard': 'AG',
  'end_date': '2021-06-17 12:40:06',
  'location': 'Harvest',
  'zone_name': 'Zone 2 Left'},
 {'dashboard': 'AG',
  'end_date': '2021-06-16 15:52:52',
  'location': 'Harvest',
  'zone_name': 'Zone 1 Left'}]
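The loop in this answer keeps whichever entry it meets first, so it gives the latest entry only when the input is sorted newest first. If that ordering cannot be relied on, one possible variant compares the timestamps directly. This is a sketch, not part of the original answer; it assumes end_date always uses the "YYYY-MM-DD HH:MM:SS" format seen in the question, and latest_per_key is a hypothetical name:

```python
from datetime import datetime


def latest_per_key(records):
    """Return the newest record for each (zone_name, location) pair.

    Works on input in any order. Assumes end_date is formatted
    like '2021-06-17 13:13:43'.
    """
    latest = {}
    for rec in records:
        key = (rec["zone_name"], rec["location"])
        stamp = datetime.strptime(rec["end_date"], "%Y-%m-%d %H:%M:%S")
        # Store the record if the key is new or this record is newer.
        if key not in latest or stamp > latest[key][0]:
            latest[key] = (stamp, rec)
    return [rec for _, rec in latest.values()]


# Hypothetical input with the older Harvest entry listed first.
records = [
    {"end_date": "2021-06-16 15:45:51", "location": "Harvest", "zone_name": "Zone 1 Left"},
    {"end_date": "2021-06-17 13:13:43", "location": "EC & pH Reading", "zone_name": "Zone 1 Left"},
    {"end_date": "2021-06-16 15:52:52", "location": "Harvest", "zone_name": "Zone 1 Left"},
]

newest = latest_per_key(records)
print([r["end_date"] for r in newest])
```

The result follows the order in which each pair was first seen, not the order of the timestamps.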

Answer 4

The other answers work, but I would like to add a solution using Pandas.

You can create a DataFrame from the list of dictionaries:

import pandas as pd
d = [{'dashboard': 'AG', 'end_date': '2021-06-17 13:13:43', 'location': 'EC & pH Reading', 'zone_name': 'Zone 1 Left'}, {'dashboard': 'AG', 'end_date': '2021-06-17 12:40:06', 'location': 'Harvest', 'zone_name': 'Zone 2 Left'}, 

{'dashboard': 'AG', 'end_date': '2021-06-16 15:52:52', 'location': 'Harvest', 'zone_name': 'Zone 1 Left' }, 

{'dashboard': 'AG', 'end_date': '2021-06-16 15:45:51', 'location': 'Harvest', 'zone_name': 'Zone 1 Left'}]
df = pd.DataFrame(d)

This is what df looks like:

  dashboard             end_date         location    zone_name
0        AG  2021-06-17 13:13:43  EC & pH Reading  Zone 1 Left
1        AG  2021-06-17 12:40:06          Harvest  Zone 2 Left
2        AG  2021-06-16 15:52:52          Harvest  Zone 1 Left
3        AG  2021-06-16 15:45:51          Harvest  Zone 1 Left

It is a bit like a table in Excel.

Now a single line does exactly what you want:

df.sort_values("end_date").drop_duplicates(["location", "zone_name"], keep="last")

Output:

  dashboard             end_date         location    zone_name
2        AG  2021-06-16 15:52:52          Harvest  Zone 1 Left
1        AG  2021-06-17 12:40:06          Harvest  Zone 2 Left
0        AG  2021-06-17 13:13:43  EC & pH Reading  Zone 1 Left
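Since the question starts from a list of dictionaries, it may be useful to turn the deduplicated DataFrame back into one. The sketch below is one way to do that with DataFrame.to_dict("records"); it sorts descending and keeps the first row per group so the newest entries come first, which is an assumption about the desired order rather than something the answer states:

```python
import pandas as pd

d = [
    {"dashboard": "AG", "end_date": "2021-06-17 13:13:43", "location": "EC & pH Reading", "zone_name": "Zone 1 Left"},
    {"dashboard": "AG", "end_date": "2021-06-17 12:40:06", "location": "Harvest", "zone_name": "Zone 2 Left"},
    {"dashboard": "AG", "end_date": "2021-06-16 15:52:52", "location": "Harvest", "zone_name": "Zone 1 Left"},
    {"dashboard": "AG", "end_date": "2021-06-16 15:45:51", "location": "Harvest", "zone_name": "Zone 1 Left"},
]
df = pd.DataFrame(d)

# end_date stays a string here; the zero-padded format sorts
# chronologically even when compared as text.
latest = (
    df.sort_values("end_date", ascending=False)
      .drop_duplicates(["location", "zone_name"], keep="first")
)

# Convert back to a plain list of dictionaries, one per remaining row.
records = latest.to_dict("records")
print(records)
```

If end_date may arrive in mixed formats, converting the column with pd.to_datetime before sorting would be safer, at the cost of getting Timestamp objects back instead of strings.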

4 answers

Answer 1
1 2021-06-17 08:55:49

Answer 2
0 2021-06-17 08:49:18

Answer 3
0 accepted 2021-06-17 08:50:45

Answer 4
0 2021-06-17 08:54:42
