I am trying to write code that compares the second element of each tuple and extracts the tuples whose second element is duplicated.
For example, if I have
List = [(0, 2), (1, 0), (2, 1), (3, 2)]
duplicate_tuples = [(0, 2), (3, 2)] # desired output
I just cannot figure out how to designate the second element in my for iteration
for i in List:  # would iterate each tuple
    if i[1] of i in List is duplicate...
Lack of pythonic grammar is frustrating. How should I approach this problem?
You can collect your tuples in a collections.defaultdict(list) keyed by the second element, then report the groups that contain more than one tuple:
from collections import defaultdict
lst = [(0, 2), (1, 0), (2, 1), (3, 2), (2, 0)]
dups = defaultdict(list)
for fst, snd in lst:
    dups[snd].append((fst, snd))
print([v for k, v in dups.items() if len(v) > 1])
# [[(0, 2), (3, 2)], [(1, 0), (2, 0)]]
Or keep the duplicates in a dictionary for easy lookups:
print({k: v for k, v in dups.items() if len(v) > 1})
# {2: [(0, 2), (3, 2)], 0: [(1, 0), (2, 0)]}
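If you want the flat list shown in the question rather than groups, one possible sketch counts the second elements first and then filters. The function name tuples_with_duplicate_second is only illustrative and not part of any library:

```python
from collections import Counter

def tuples_with_duplicate_second(pairs):
    """Return the tuples whose second element occurs more than once, in input order."""
    # First pass: count how often each second element appears
    counts = Counter(snd for _, snd in pairs)
    # Second pass: keep only tuples whose second element is shared
    return [p for p in pairs if counts[p[1]] > 1]

lst = [(0, 2), (1, 0), (2, 1), (3, 2)]
duplicate_tuples = tuples_with_duplicate_second(lst)
print(duplicate_tuples)  # [(0, 2), (3, 2)]
```

This keeps the original order of the tuples and needs only two passes over the list.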
Working with numpy arrays instead of lists of tuples can be more efficient, especially for large inputs.
import numpy as np
a = np.array([(0, 2), (1, 0), (2, 1), (3, 2), (3, 0)])
unique_vals, inverse_indices, counts = np.unique(a[:, 1], return_inverse=True, return_counts=True)
Based on the output of np.unique, we can build the list of duplicates. Note that inverse_indices holds positions into unique_vals, not the values themselves, so we select rows by position:
duplicates = [(unique_vals[i], a[inverse_indices == i]) for i in np.flatnonzero(counts > 1)]
Output:
[(0, array([[1, 0],[3, 0]])),
(2, array([[0, 2],[3, 2]]))]
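If you only need the duplicated rows themselves and not the grouping, a boolean mask is one way to do it without a Python-level loop. This is a minimal sketch that assumes a two-column integer array like the one above; the names mask and dup_rows are just illustrative:

```python
import numpy as np

a = np.array([(0, 2), (1, 0), (2, 1), (3, 2), (3, 0)])

# Distinct second-column values and how often each one occurs
values, counts = np.unique(a[:, 1], return_counts=True)

# True for every row whose second element is one of the repeated values
mask = np.isin(a[:, 1], values[counts > 1])

dup_rows = a[mask]
print(dup_rows.tolist())  # [[0, 2], [1, 0], [3, 2], [3, 0]]
```

The rows come back in their original order, so the result is flat rather than grouped by value.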
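A value may be duplicated more than once, so groupby is a better option. Note that groupby only groups adjacent items, so the list has to be sorted by the same key first.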
In [6]: from itertools import groupby
In [7]: for g, l in groupby(sorted(lst, key=lambda x: x[1]), key=lambda x: x[1]):
   ...:     temp = list(l)
   ...:     if len(temp) > 1:
   ...:         print(g, temp)
...:
2 [(0, 2), (3, 2)]
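Outside an interactive session, the same groupby idea could be wrapped in a small function. This is only a sketch; duplicate_groups is a made-up name, and operator.itemgetter(1) simply replaces the lambda used above:

```python
from itertools import groupby
from operator import itemgetter

def duplicate_groups(pairs):
    """Map each repeated second element to the list of tuples that share it."""
    second = itemgetter(1)
    result = {}
    # groupby only merges adjacent items, so sort by the same key first
    for key, group in groupby(sorted(pairs, key=second), key=second):
        members = list(group)
        if len(members) > 1:
            result[key] = members
    return result

groups = duplicate_groups([(0, 2), (1, 0), (2, 1), (3, 2), (2, 0)])
print(groups)  # {0: [(1, 0), (2, 0)], 2: [(0, 2), (3, 2)]}
```

Because sorted is stable, the tuples inside each group keep their original relative order.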
Here is another approach, using numpy:
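import numpy as np
duplicate_list = []
foo = np.array([(0, 2), (1, 0), (2, 1), (3, 2), (3, 0), (1, 2)])
# Iterate over the distinct second-column values themselves rather than
# range(len(...)), so the loop also works when the values are not 0..n-1
for i in np.unique(foo[:, 1]):
    if np.sum(foo[:, 1] == i) > 1:
        duplicate_list.append(foo[foo[:, 1] == i].tolist())
print(duplicate_list)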
Output:
[[[1, 0], [3, 0]], [[0, 2], [3, 2], [1, 2]]]
np.unique(foo[:,1]) gives the distinct values of the second element. For each value, the matching rows are appended to the list if that value occurs more than once. Here this returns two lists, because two values (0 and 2) are duplicated. If you are only interested in one specific value, say 2, you can avoid the loop.
For example:
bla = np.array([(0, 2), (1, 0), (2, 1), (3, 2)])
duplicate = []
if np.sum(bla[:, 1] == 2) > 1:
    duplicate = bla[bla[:, 1] == 2].tolist()
print(duplicate)
Output:
[[0, 2], [3, 2]]