
Converting a list of tuples by combining neighboring elements and cutting the length of the elements

I have a list of tuples whose length x is even:

[(a1,a2,a3,a4),(b1,b2,b3,b4),(c1,c2,c3,c4),(d1,d2,d3,d4), ... ,(x1,x2,x3,x4)]  

I wish to create a new list of tuples as follows:

[(a1,a2,b1,b2),(c1,c2,d1,d2),(e1,e2,f1,f2), ... ,((x-1)1,(x-1)2,x1,x2)]  

As you can see, the third and fourth elements of every tuple are gone, and neighboring tuples have been merged.

I can do it with a few loops, but I'm looking for an elegant, Pythonic way to do it.

I pondered what the Pythonic way is here, and concluded that if you do not need a sequence, then accepting any iterable is the Pythonic choice. Martijn's xrange method is not the most Pythonic in this sense, whereas Martijn's grouper method

from itertools import izip  # Python 2; on Python 3 use the built-in zip
result = [ i[:2] + j[:2] for i, j in izip(*[iter(l)] * 2) ]

is the most Pythonic in that it accepts any kind of iterable.

The expression [iter(l)] * 2 creates a two-element list that holds the same iterator in both positions. This is faster than my attempt with next() below, which I did not expect at first, because each call to next() is costly: on CPython, function calls and attribute/name lookups are slow relative to inline operators and tuple construction.
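To see why repeating the iterator produces pairs, here is a minimal sketch. It assumes Python 3, where the built-in zip plays the role of izip; the sample data and the names it, args and pairs are illustrative only and do not come from the answer.

```python
l = [('a1', 'a2', 'a3', 'a4'), ('b1', 'b2', 'b3', 'b4'),
     ('c1', 'c2', 'c3', 'c4'), ('d1', 'd2', 'd3', 'd4')]

it = iter(l)
args = [it] * 2              # two references to the SAME iterator object
assert args[0] is args[1]

# zip takes one item from args[0], then one from args[1]. Because both are
# the same iterator, each output pair holds two consecutive tuples.
pairs = list(zip(*args))

result = [i[:2] + j[:2] for i, j in pairs]
print(result)
```

Materialising pairs with list() is only done here so the intermediate step can be inspected; the one-liner in the answer consumes the pairs lazily.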


This is the quick, obvious solution, which I find the most Pythonic when working strictly with lists:

l = [('a1', 'a2', 'a3', 'a4'), ('b1', 'b2', 'b3', 'b4'), ('c1', 'c2', 'c3', 'c4'), ('d1', 'd2', 'd3', 'd4')]
y = [ i[:2] + j[:2] for i, j in zip(l[::2], l[1::2]) ]
print y

which prints

[('a1', 'a2', 'b1', 'b2'), ('c1', 'c2', 'd1', 'd2')]

However, it is not the most efficient: on Python 2, zip and the slices each create temporary lists. To speed this up, one can use itertools.izip together with itertools.islice to build a lazy, generator-based solution.
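A sketch of that lazy variant might look like the following. It is written for Python 3, where the built-in zip is already lazy; on Python 2 you would presumably swap in itertools.izip. The names evens and odds are made up for illustration.

```python
from itertools import islice

l = [('a1', 'a2', 'a3', 'a4'), ('b1', 'b2', 'b3', 'b4'),
     ('c1', 'c2', 'c3', 'c4'), ('d1', 'd2', 'd3', 'd4')]

# islice walks the list lazily instead of copying it like l[::2] would.
evens = islice(l, 0, None, 2)   # items at indices 0, 2, 4, ...
odds = islice(l, 1, None, 2)    # items at indices 1, 3, 5, ...

y = [i[:2] + j[:2] for i, j in zip(evens, odds)]
print(y)
```

Like the slice-and-zip version, this silently drops a trailing unpaired tuple, because zip stops at the shorter input.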


But actually, one can abuse an iterator here to write quite efficient code, albeit not very Pythonic:

i = iter(l)
n = i.next
result = [ j[:2] + n()[:2] for j in i ]
print result

prints

[('a1', 'a2', 'b1', 'b2'), ('c1', 'c2', 'd1', 'd2')]

The for clause of the list comprehension implicitly calls next() on the iterator i and assigns the result to j; within the body of the list comprehension we explicitly advance the same iterator to get the following tuple. Tuple slicing then takes the first 2 elements of the even-numbered tuple (0-based) and concatenates the first 2 elements of the next tuple to it. However, this code raises a StopIteration exception if the original list has an odd number of elements, unlike the slice'n'zip solutions, which would just silently discard the last unpaired tuple.
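The odd-length behaviour described above can be checked with a small sketch. This is a Python 3 adaptation that uses the built-in next() instead of the Python 2 i.next method; the function name merge_pairs is hypothetical.

```python
def merge_pairs(tuples):
    """Merge neighboring tuples, keeping the first two items of each."""
    it = iter(tuples)
    # The for clause pulls the even-indexed tuple; next(it) pulls its neighbor.
    return [first[:2] + next(it)[:2] for first in it]

even = [('a1', 'a2', 'a3', 'a4'), ('b1', 'b2', 'b3', 'b4'),
        ('c1', 'c2', 'c3', 'c4'), ('d1', 'd2', 'd3', 'd4')]
print(merge_pairs(even))

odd = even[:3]  # three tuples: the last one has no neighbor
try:
    merge_pairs(odd)
except StopIteration:
    print("odd-length input raised StopIteration")
```

Whether raising or silently truncating is preferable depends on whether an odd-length input should count as an error in your data.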

Iterate over the list in pairs and produce new tuples:

[tups[i][:2] + tups[i + 1][:2] for i in xrange(0, len(tups), 2)]

where tups is your input list; use range() if you are using Python 3.

If tups is an iterable and not a sequence (so you cannot use indexing for fast access), you can also use the itertools grouper recipe (adjusted for your use case and adapted to work on both Python 2 and 3):

try:
    # Python 2
    from future_builtins import zip
except ImportError:
    pass  # Python 3

def grouper(iterable, n):
    args = [iter(iterable)] * n
    return zip(*args)

[t1[:2] + t2[:2] for t1, t2 in grouper(tups, 2)]

but this is somewhat overkill if tups is a list.

Demo:

>>> tups = [('a1', 'a2', 'a3', 'a4'), ('b1', 'b2', 'b3', 'b4'), ('c1', 'c2', 'c3', 'c4'), ('d1', 'd2', 'd3', 'd4')]  
>>> [tups[i][:2] + tups[i + 1][:2] for i in xrange(0, len(tups), 2)]
[('a1', 'a2', 'b1', 'b2'), ('c1', 'c2', 'd1', 'd2')]

Even so, the grouper option is the faster choice (which surprised me a little):

In [1]: tups = [('a', 'a', 'a', 'a')] * 1000

In [2]: from future_builtins import zip

In [3]: def grouper(iterable, n):
   ...:         args = [iter(iterable)] * n
   ...:         return zip(*args)
   ...: 

In [4]: %timeit [tups[i][:2] + tups[i + 1][:2] for i in xrange(0, len(tups), 2)]
10000 loops, best of 3: 112 µs per loop

In [5]: %timeit [t1[:2] + t2[:2] for t1, t2 in grouper(tups, 2)]
10000 loops, best of 3: 90.6 µs per loop
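To repeat this comparison outside IPython, something along these lines could be used. It is a sketch for Python 3 using the standard timeit module; the wrapper names by_index and by_grouper are invented here, and the timings will differ by machine and interpreter version, so none are shown.

```python
import timeit

tups = [('a', 'a', 'a', 'a')] * 1000

def by_index(tups):
    # Index-based version; range replaces xrange on Python 3.
    return [tups[i][:2] + tups[i + 1][:2] for i in range(0, len(tups), 2)]

def grouper(iterable, n):
    args = [iter(iterable)] * n
    return zip(*args)

def by_grouper(tups):
    return [t1[:2] + t2[:2] for t1, t2 in grouper(tups, 2)]

# Both approaches should agree before their speed is compared.
assert by_index(tups) == by_grouper(tups)

for fn in (by_index, by_grouper):
    seconds = timeit.timeit(lambda: fn(tups), number=1000)
    print(fn.__name__, seconds)
```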
