Removing duplicates from a list of lists in Python using deep copy

Question

I have a list of dictionaries, list_1 = [{'account': '1234', 'email': 'abc@xyz.com'}, ... , ...], and I want to remove the entries with duplicate emails from the list.

import copy

# list_2 is a deep copy used only for lookups; removals happen on list_1
list_2 = copy.deepcopy(list_1)
for email in mainList:
    for j in range(len(list_2) - 1, -1, -1):
        if list_2[j]["email"] == email:
            list_1.remove(list_2[j])

mainList here is the list of emails I am comparing values against. It looks like: ['abc@xyz.com', 'efg@cvb.com', ..., ...]. The main problem is that list_1 is not coming out correctly. If I use list(), slicing, or even a list comprehension to copy it, it comes out empty. The end result should be list_1 containing only one dictionary for each email. Using copy or deepcopy at least gives me something. I also sometimes seem to get an indexing error. Using

for x in list_2:

instead returns list_1 with only one item. The closest I got to the correct answer was iterating over list_1 itself while removing items, but it was not 100% correct. Please help.

Answer 1

Iterate over your list of dictionaries and save each entry in a new dictionary keyed by email, but only if that email is not already present.

temp = dict()
list_1 = [{'account': '1234', 'email': 'abc@xyz.com'}]
for d in list_1:
    if d['email'] in temp:
        continue
    else:
        temp[d['email']] = d
final_list = list(temp.values())
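
As a side note on why the loop above is safer than removing from list_1 while looping over it: removing an element shifts the remaining ones left, so the loop skips whichever item slides into the freed slot. A minimal sketch of that pitfall, using made-up data (the `rows` list and its values are illustrative, not taken from the question):

```python
rows = [
    {'account': '1', 'email': 'a@x.com'},
    {'account': '2', 'email': 'a@x.com'},
    {'account': '3', 'email': 'a@x.com'},
    {'account': '4', 'email': 'b@x.com'},
]

# Buggy on purpose: mutate the list while iterating over it.
buggy = list(rows)
seen = set()
for row in buggy:
    if row['email'] in seen:
        # Removing shifts later items left, so the next item is never visited.
        buggy.remove(row)
    else:
        seen.add(row['email'])

# Account '3' is a duplicate of 'a@x.com' but survives because it was skipped.
print([row['account'] for row in buggy])  # ['1', '3', '4']
```

Because the answer's loop only reads list_1 and writes to a separate dictionary, it never runs into this skipping behaviour.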

Answer 2

It seems you want to remove duplicate dictionaries. Please also include the duplicate dictionaries in the problem statement.

di = [{'account': '1234', 'email': 'abc@xyz.com'},
      {'account1': '12345', 'email1': 'abcd@xyz.com'},
      {'account': '1234', 'email': 'abc@xyz.com'}]
# keep an item only if no equal dictionary appears later in the list
s = [i for n, i in enumerate(di) if i not in di[n + 1:]]
print(s)

This gives the required output:

[{'account1': '12345', 'email1': 'abcd@xyz.com'}, {'account': '1234', 'email': 'abc@xyz.com'}]
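
One caveat: `i not in di[n + 1:]` compares whole dictionaries, so two entries with the same email but different accounts would both survive. If the goal is one entry per email, as the question describes, something along these lines might be closer. This is only a sketch; the helper name `dedupe_by_key` and the sample data are hypothetical.

```python
def dedupe_by_key(rows, key):
    """Keep the first row seen for each value of row[key], preserving order."""
    seen = set()
    result = []
    for row in rows:
        value = row[key]
        if value not in seen:
            seen.add(value)
            result.append(row)
    return result


sample = [
    {'account': '1234', 'email': 'abc@xyz.com'},
    {'account': '9999', 'email': 'abc@xyz.com'},  # same email, different account
    {'account': '4321', 'email': 'zzz@xyz.com'},
]

# Whole-dictionary comparison: no two dicts are equal, so nothing is dropped.
whole_dict = [d for n, d in enumerate(sample) if d not in sample[n + 1:]]

# Key-based comparison: only one row per email is kept.
by_email = dedupe_by_key(sample, 'email')

print(len(whole_dict), len(by_email))  # 3 2
```

The set lookup also avoids rescanning the rest of the list for every item, which may matter for long lists.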

Answer 3

I feel the easiest way to accomplish this is to create an indexed version of list_1 (a dictionary) based on your key.

list_1 = [
    {'account': '1234', 'email' : 'abc@xyz.com'},
    {'account': '1234', 'email' : 'abc@xyz.com'},
    {'account': '4321', 'email' : 'zzz@xyz.com'},
]

list_1_indexed = {}
for row in list_1:
    list_1_indexed.setdefault(row['email'], row)
list_2 = list(list_1_indexed.values())

print(list_2)

This will give you:

[
    {'account': '1234', 'email': 'abc@xyz.com'},
    {'account': '4321', 'email': 'zzz@xyz.com'}
]

I'm not sure I would recommend it, but if you wanted to use a comprehension you might do:

list_2 = list({row['email']: row for row in list_1}.values())

Note that with the first strategy the first row seen for each key wins, whereas with the comprehension the last row for each key wins.
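
To make that difference concrete, here is a small sketch with two rows that share an email but have different (made-up) account numbers:

```python
rows = [
    {'account': '1111', 'email': 'abc@xyz.com'},
    {'account': '2222', 'email': 'abc@xyz.com'},  # same email, later row
]

# setdefault stores a row only if the key is not present yet: first row wins.
first_wins = {}
for row in rows:
    first_wins.setdefault(row['email'], row)

# A dict comprehension overwrites the value for a repeated key: last row wins.
last_wins = {row['email']: row for row in rows}

print(first_wins['abc@xyz.com']['account'])  # 1111
print(last_wins['abc@xyz.com']['account'])   # 2222
```

Which one to prefer depends on whether the earliest or the most recent entry for an email should be kept.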

3 answers

solution1
0 ACCEPTED 2022-01-25 13:46:43

solution2
0 2022-01-25 13:51:59

solution3
0 2022-01-25 13:56:28
