I have a list of tuples similar to A:
A = [[(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)],
     [(160, 2, 5), (1000, 2, 5), (111, 1, 2)],
     [(134, 3, 5), (126, 1, 3), (128, 3, 4), (139, 1, 3)],
     [(128, 3, 4)],
     [(90, 1, 5), (160, 2, 5), (134, 3, 5), (1000, 2, 5), (1000, 1, 5), (176, 1, 5)]]
In each row of this list, there may be tuples whose second and third elements are the same. For example, in A[0]:
A[0] = [(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)]
(90, 1, 5), (1000, 1, 5) and (176, 1, 5) have the same second and third elements. Among these, I need to keep the one which has the max value for the first element and remove the other two. So, I should be able to keep (1000, 1, 5) and remove (90, 1, 5) and (176, 1, 5) from A[0].
It would be better to keep the ordering of the list.
Is there any way to do that iteratively for all the rows in A? Any help would be appreciated!
If I understand correctly, here's an itertools.groupby solution. I'm assuming order in the final result does not matter.
from itertools import groupby

def keep_max(lst, groupkey, maxkey):
    """Group lst by groupkey and keep the maximum of each group w.r.t. maxkey."""
    sor = sorted(lst, key=groupkey)  # groupby only merges adjacent items, so sort first
    groups = (tuple(g) for _, g in groupby(sor, key=groupkey))
    return [max(g, key=maxkey) for g in groups]
In action:
>>> from operator import itemgetter
>>> groupkey = itemgetter(1, 2)
>>> maxkey = itemgetter(0)
>>> A = [[(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)], [(160, 2, 5), (1000, 2, 5), (111, 1, 2)], [(134, 3, 5), (126, 1, 3), (128, 3, 4), (139, 1, 3)], [(128, 3, 4)], [(90, 1, 5), (160, 2, 5), (134, 3, 5), (1000, 2, 5), (1000, 1, 5), (176, 1, 5)]]
>>>
>>> [keep_max(sub, groupkey, maxkey) for sub in A]
[[(111, 1, 2), (139, 1, 3), (1000, 1, 5)],
[(111, 1, 2), (1000, 2, 5)],
[(139, 1, 3), (128, 3, 4), (134, 3, 5)],
[(128, 3, 4)],
[(1000, 1, 5), (1000, 2, 5), (134, 3, 5)]]
This solution keeps the original ordering of the tuples, assuming each tuple (as a whole) is unique within its row. If a row contains duplicate tuples, a surviving tuple is placed at the position of its last appearance:
from operator import itemgetter

A = [[(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)],
     [(160, 2, 5), (1000, 2, 5), (111, 1, 2)],
     [(134, 3, 5), (126, 1, 3), (128, 3, 4), (139, 1, 3)],
     [(128, 3, 4)],
     [(90, 1, 5), (160, 2, 5), (134, 3, 5), (1000, 2, 5), (1000, 1, 5), (176, 1, 5)]]

def uniques(lst):
    groups = {}
    for t in lst:
        groups.setdefault(t[1:], []).append(t)
    lookup = {t: i for i, t in enumerate(lst)}
    index = lookup.get
    first = itemgetter(0)
    return sorted(map(lambda x: max(x, key=first), groups.values()), key=index)

result = [uniques(a) for a in A]
print(result)
Output
[[(139, 1, 3), (1000, 1, 5), (111, 1, 2)], [(1000, 2, 5), (111, 1, 2)], [(134, 3, 5), (128, 3, 4), (139, 1, 3)], [(128, 3, 4)], [(134, 3, 5), (1000, 2, 5), (1000, 1, 5)]]
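To make the duplicate caveat concrete, here is a small check on made-up input (the row below is hypothetical and not taken from the question). The function is repeated, lightly condensed, so that the snippet runs on its own:

```python
from operator import itemgetter

def uniques(lst):
    # Same logic as the function above, repeated so this snippet is self-contained.
    groups = {}
    for t in lst:
        groups.setdefault(t[1:], []).append(t)
    # A tuple that occurs more than once ends up mapped to its last index.
    lookup = {t: i for i, t in enumerate(lst)}
    return sorted((max(g, key=itemgetter(0)) for g in groups.values()),
                  key=lookup.get)

# Hypothetical row in which (5, 1, 1) appears twice.
row = [(5, 1, 1), (3, 2, 2), (5, 1, 1)]
deduped = uniques(row)
print(deduped)  # [(3, 2, 2), (5, 1, 1)]
```

Because (5, 1, 1) is looked up at index 2 rather than index 0, it sorts after (3, 2, 2) in the result.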
Using dictionaries:
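fin = []
for row in A:
    best = {}  # (second, third) -> largest first element seen so far
    for first, second, third in row:
        key = (second, third)
        if key not in best or first > best[key]:
            best[key] = first
    # Rebuild the tuples from the dict (use iteritems() instead of items() on Python 2).
    fin.append([(value, t1, t2) for (t1, t2), value in best.items()])
Using this, the dict for A[0] turns
A[0] = [(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)]
into
{ (1,5): 1000, (1,3): 139, (1,2): 111 } (as a dict)
which is then converted back to a list of tuples using items().
On Python 3.7+, where dicts keep insertion order, the groups come out in the order in which each (second, third) pair first appears in the row.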
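If each surviving tuple should stay at its original position in the row (rather than at the position where its key was first seen), one option is a two-pass variant of the dictionary idea. This is only a sketch; the name keep_max_in_place is made up for illustration:

```python
def keep_max_in_place(row):
    # Pass 1: find the largest first element for each (second, third) pair.
    best = {}
    for t in row:
        key = t[1:]
        if key not in best or t[0] > best[key]:
            best[key] = t[0]
    # Pass 2: keep each winner at its original position, once per key.
    kept, seen = [], set()
    for t in row:
        key = t[1:]
        if t[0] == best[key] and key not in seen:
            seen.add(key)
            kept.append(t)
    return kept

row = [(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)]
print(keep_max_in_place(row))  # [(139, 1, 3), (1000, 1, 5), (111, 1, 2)]
```

Applying it to every row would then be [keep_max_in_place(row) for row in A].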
If you can afford to ignore ordering, you can use itertools.groupby to group elements by their 2nd and 3rd elements, on a list sorted in ascending order of the 2nd and 3rd elements and descending order of the first element. The first element of each group is then your desired choice:
from itertools import groupby

A = [[(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)],
     [(160, 2, 5), (1000, 2, 5), (111, 1, 2)],
     [(134, 3, 5), (126, 1, 3), (128, 3, 4), (139, 1, 3)],
     [(128, 3, 4)],
     [(90, 1, 5), (160, 2, 5), (134, 3, 5), (1000, 2, 5), (1000, 1, 5), (176, 1, 5)]]

def max_duplicate(lst):
    res = []
    for k, g in groupby(sorted(lst, key=lambda x: (x[1], x[2], -x[0])), key=lambda x: (x[1], x[2])):
        res.append(next(g))
    return res

result = [max_duplicate(l) for l in A]
for r in result:
    print(r)
Output
[(111, 1, 2), (139, 1, 3), (1000, 1, 5)]
[(111, 1, 2), (1000, 2, 5)]
[(139, 1, 3), (128, 3, 4), (134, 3, 5)]
[(128, 3, 4)]
[(1000, 1, 5), (1000, 2, 5), (134, 3, 5)]
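The sort is not optional here: groupby only merges adjacent items that share a key. A quick way to see this is to compare the group keys produced with and without sorting on the first row of A (the variable names below are chosen for illustration):

```python
from itertools import groupby

row = [(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)]
key = lambda x: (x[1], x[2])

# Without sorting, the (1, 5) key shows up as three separate groups.
unsorted_groups = [k for k, _ in groupby(row, key=key)]
# After sorting by the same key, each key forms exactly one group.
sorted_groups = [k for k, _ in groupby(sorted(row, key=key), key=key)]

print(unsorted_groups)  # [(1, 5), (1, 3), (1, 5), (1, 2), (1, 5)]
print(sorted_groups)    # [(1, 2), (1, 3), (1, 5)]
```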
You can do this by using a hashmap as follows:
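result = []
for row in A:
    d = {}  # reset for every row so that rows are handled independently
    for v, k1, k2 in row:
        if (k1, k2) in d:
            d[(k1, k2)] = max(v, d[(k1, k2)])
        else:
            d[(k1, k2)] = v
    # Use d.iteritems() instead of d.items() on Python 2.
    result.append([(v, k1, k2) for (k1, k2), v in d.items()])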