
Remove duplicates from a multidimensional list (Python)

I'm trying to remove duplicate items from a multidimensional list. My goal is to remove items that appear more than once across the inner lists.

For example: List 2, List 6 and List 7 all contain the car Bentley. I want to remove that car from two of the lists, so that it remains in only one.

How do I accomplish this?

The code below only works if I pass in a single list containing duplicate entries, but I need to deduplicate a multidimensional list.

cars = [
     ["Acura", "Alfa Romeo", "Aston Martin", "Audi", "Aston Martin"],
     ["Bentley", "BMW", "Bugatti", "Buick"],
     ["Cadillac", "Chrysler", "Citroen"],
     ["Dodge", "Ferrari", "Fiat", "Ford"],
     ["Geely", "Honda", "Hyundai", "Infiniti"],
     ["Alfa Romeo", "Bentley", "Hyundai", "Lamborghini"],
     ["Koenigsegg", "Bentley", "Maserati", "Lamborghini"]
    ]

def remove(duplicate):
  final_list = []
  for num in duplicate:
    if num not in final_list:
        final_list.append(num)
  return final_list


print (remove(cars))

This returns the original list unchanged:
[
 ['Acura', 'Alfa Romeo', 'Aston Martin', 'Audi', 'Aston Martin'],
 ['Bentley', 'BMW', 'Bugatti', 'Buick'],
 ['Cadillac', 'Chrysler', 'Citroen'],
 ['Dodge', 'Ferrari', 'Fiat', 'Ford'],
 ['Geely', 'Honda', 'Hyundai', 'Infiniti'],
 ['Alfa Romeo', 'Bentley', 'Hyundai', 'Lamborghini'],
 ['Koenigsegg', 'Bentley', 'Maserati', 'Lamborghini']
]

My desired output after deduplication is shown below. No list within this multidimensional list contains a duplicate entry.

 [
  ['Acura', 'Alfa Romeo', 'Aston Martin', 'Audi'],
  ['Bentley', 'BMW', 'Bugatti', 'Buick'],
  ['Cadillac', 'Chrysler', 'Citroen'],
  ['Dodge', 'Ferrari', 'Fiat', 'Ford'],
  ['Geely', 'Honda', 'Hyundai', 'Infiniti'],
  ['Bentley', 'Hyundai', 'Lamborghini'],
  ['Koenigsegg', 'Maserati']
 ]

You can do it like this, tracking every element already seen in a set and keeping only the first occurrence of each:

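def remove(duplicate):
    final_list = []
    found = set()  # every element seen so far, across all sublists
    for num in duplicate:
        lst = []  # deduplicated copy of the current sublist
        for element in num:
            if element not in found:
                # first occurrence: remember it and keep it
                found.add(element)
                lst.append(element)
        final_list.append(lst)
    return final_list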

Output:

[
    ['Acura', 'Alfa Romeo', 'Aston Martin', 'Audi'],
    ['Bentley', 'BMW', 'Bugatti', 'Buick'],
    ['Cadillac', 'Chrysler', 'Citroen'],
    ['Dodge', 'Ferrari', 'Fiat', 'Ford'],
    ['Geely', 'Honda', 'Hyundai', 'Infiniti'],
    ['Lamborghini'],
    ['Koenigsegg', 'Maserati']
]

You can use memoization for this purpose. Store the first occurrence of every element in a list, say list F. If an element is not in F, it is unique so far: store it in F and keep it, then repeat the process for the next element. If it is already in F, it is a duplicate and can be dropped.
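The steps described above could be sketched roughly as follows. This is only an illustration, not the answerer's own code: the function name `dedupe_with_first_seen` and the shortened `cars` sample are assumptions made for the example. It keeps F as a plain list, as the description suggests, so every lookup is a linear scan; a set is usually faster for hashable items such as strings, but a list also works when the elements are not hashable.

```python
def dedupe_with_first_seen(nested):
    """Keep only the first occurrence of each item across all sublists."""
    F = []       # first occurrences, in the order they were encountered
    result = []
    for sublist in nested:
        kept = []
        for item in sublist:
            if item not in F:      # linear scan of F on every lookup
                F.append(item)     # remember the first occurrence
                kept.append(item)  # and keep it in the output
        result.append(kept)
    return result


# Shortened sample data (hypothetical subset of the question's list)
cars = [
    ["Acura", "Alfa Romeo", "Aston Martin", "Audi", "Aston Martin"],
    ["Bentley", "BMW", "Bugatti", "Buick"],
    ["Alfa Romeo", "Bentley", "Hyundai", "Lamborghini"],
]

print(dedupe_with_first_seen(cars))
# [['Acura', 'Alfa Romeo', 'Aston Martin', 'Audi'],
#  ['Bentley', 'BMW', 'Bugatti', 'Buick'],
#  ['Hyundai', 'Lamborghini']]
```

The input lists are left untouched; the function builds and returns new lists, and a sublist whose items were all seen earlier comes back empty rather than being dropped.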
