For some reason, I keep having 'how do I sort this list of tuples' questions. (A prior question of mine: sorting list of tuples by arbitrary key.)
Here is some arbitrary raw input:
import random
number_of = 3 # or whatever
tuple_list = [(n, 'a', 'b', 'c') for n in range(number_of)] # [(0, 'a', 'b', 'c')...]
ordering_list = random.sample(range(number_of), number_of) # e.g. [1, 0, 2]
Sorting tuple_list by ordering_list using sorted:
ordered = sorted(tuple_list, key=lambda t: ordering_list.index(t[0]))
# ordered = [(1, 'a', 'b', 'c'), (0, 'a', 'b', 'c'), (2, 'a', 'b', 'c')]
I have a slightly awkward approach which seems to be much faster, especially as the number of elements in tuple_list grows. I create a dictionary, list_dict, breaking each tuple into a (tuple[0], tuple[1:]) item. I retrieve the dictionary items using ordering_list as keys, and then reassemble the sequence of (tuple[0], tuple[1:]) pairs into a list of tuples, using an idiom I'm still trying to wrap my head around completely: zip(*[iter(_list)] * x), where x is the length of each tuple composed of items from _list. So my question is: is there a version of this approach that manages the disassemble-reassemble part of the code better?
def gen_key_then_values(key_list, list_dict):
    for key in key_list:
        values = list_dict[key]
        yield key
        for n in values:
            yield n
list_dict = {t[0]: t[1:] for t in tuple_list}
ordered = zip(*[gen_key_then_values(ordering_list, list_dict)] * 4)
NOTE: better code, following an obvious comment from Steve Jessop below:
list_dict = {t[0]: t for t in tuple_list}
ordered = [list_dict[k] for k in ordering_list]
My actual project code still requires assembling a tuple for each (k, ['a', 'b' ...]) item retrieved from the list_dict, but there was no reason for me to include that part of the code here.
Breaking the elements of tuple_list apart in the dictionary doesn't really gain you anything and requires creating a bunch more tuples for the values. All you're doing is looking up elements in the list according to their first element, so it's probably not worth actually splitting them:
list_dict = { t[0] : t for t in tuple_list }
Note that this only works if the first element is unique, but then the ordering_list only makes sense if the first element is unique, so that's probably OK.
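To see why uniqueness matters, here is a small sketch with made-up data (the `dupes` list is hypothetical, not from the question): when two tuples share a first element, the later one silently replaces the earlier one in the dictionary.

```python
# Hypothetical data: two tuples share the first element 0.
dupes = [(0, 'a'), (1, 'b'), (0, 'c')]

# Later entries overwrite earlier ones that have the same key.
list_dict = {t[0]: t for t in dupes}

print(len(dupes))      # 3 tuples went in
print(len(list_dict))  # only 2 keys survive
print(list_dict[0])    # (0, 'c'), the last tuple seen for key 0
```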
zip(*[iter(_list)] * 4) is just a way of grouping _list into fours, so give it a suitable name and you won't have to worry about it:
def fixed_size_groups(n, iterable):
    return zip(*[iter(iterable)] * n)
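If the idiom still feels opaque, this sketch may help. The list `[it] * n` holds n references to the same iterator, so every tuple that zip builds pulls n consecutive items from it. The sample data is made up for illustration, and the result is wrapped in `list()` here only so it prints the same way on Python 3.

```python
def fixed_size_groups(n, iterable):
    # [it] * n is a list of n references to ONE iterator, so zip
    # draws n consecutive items for every tuple it builds.
    it = iter(iterable)
    return list(zip(*[it] * n))

# Hypothetical flat sequence: key, then three values, repeated.
flat = [1, 'a', 'b', 'c', 0, 'a', 'b', 'c']
groups = fixed_size_groups(4, flat)
print(groups)  # [(1, 'a', 'b', 'c'), (0, 'a', 'b', 'c')]

# A trailing partial group is dropped, because zip stops as soon
# as the shared iterator runs out.
print(fixed_size_groups(3, range(7)))  # [(0, 1, 2), (3, 4, 5)]
```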
But all things considered you don't actually need it anyway:
ordered = list(list_dict[val] for val in ordering_list)
The reason your first code is slow is that ordering_list.index is slow: it searches through ordering_list for t[0], and it does this once for each t. So in total it does roughly (number_of ** 2) / 2 inspections of a list element.
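If you would rather keep the `sorted` call, one way to avoid the repeated linear searches is to compute each key's position once, up front. This is only a sketch, not part of the original code: `position` is a name introduced here, and it assumes the values in `ordering_list` are unique and hashable.

```python
import random

number_of = 1000
tuple_list = [(n, 'a', 'b', 'c') for n in range(number_of)]
ordering_list = random.sample(range(number_of), number_of)

# Build the key -> position table once: a single pass over ordering_list.
position = {key: i for i, key in enumerate(ordering_list)}

# Each sort key is now a dict access instead of a list search.
ordered = sorted(tuple_list, key=lambda t: position[t[0]])

# Same result as the direct dictionary lookup.
list_dict = {t[0]: t for t in tuple_list}
assert ordered == [list_dict[k] for k in ordering_list]
```

The sort itself still costs O(n log n), so the plain lookup through list_dict remains the cheaper option when you already have the complete ordering.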