
Split a list in sublists based on the difference between consecutive values

I have a list in which every value has at least one (but often more) neighboring value that differs from it by an increment of .033:

l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]

I would like to split this list into sublists so that consecutive items that differ by .033 are grouped together, and a new sublist starts whenever the difference is larger:

l = [ [26.051, 26.084, 26.117, 26.15, 26.183], [31.146, 31.183], [34.477, 34.51, 34.543] ] 

Keep track of the last element you saw and either append the current item to the last sublist, or create a new sublist if the difference is greater than your allowed delta.

res, last = [[]], None
for x in l:
    if last is None or abs(last - x) <= 0.033:
        res[-1].append(x)
    else:
        res.append([x])
    last = x

Note, however, that a value of 0.033 will not in fact return the result you want, as some of the differences are considerably larger (0.037) or just slightly larger due to floating-point rounding. Instead, you might want to use a slightly more generous value. For example, 0.035 gives you [[26.051, 26.084, 26.117, 26.15, 26.183], [31.146], [31.183], [34.477, 34.51, 34.543]]; a tolerance above 0.037, such as 0.04, also keeps 31.146 and 31.183 together.
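To make the effect of the tolerance concrete, here is a minimal sketch that first lists the rounded differences between neighbors and then splits with a configurable threshold. The names split_by_gap and max_gap are made up for this illustration, and the default of 0.04 is an assumption chosen only because it sits above the 0.037 gap in the sample data.

```python
l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]

# Differences between neighbors, rounded to hide float noise.
diffs = [round(b - a, 3) for a, b in zip(l, l[1:])]
print(diffs)

def split_by_gap(values, max_gap=0.04):
    """Split values into runs whose consecutive differences are <= max_gap.

    split_by_gap and max_gap are illustrative names, not part of any library.
    """
    groups = []
    for x in values:
        # Start a new run at the first value or after a gap larger than max_gap.
        if not groups or abs(x - groups[-1][-1]) > max_gap:
            groups.append([x])
        else:
            groups[-1].append(x)
    return groups

print(split_by_gap(l))                 # 31.146 and 31.183 stay together
print(split_by_gap(l, max_gap=0.035))  # 31.146 and 31.183 are split apart
```

Starting from an empty groups list instead of [[]] also means an empty input simply yields an empty result.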

One can use temporary lists and a while loop to get the desired result (note that pop(0) empties the original list l as it goes):

l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]
outlist = []
templist = [l.pop(0)]
while len(l)>0:
    x = l.pop(0)
    if x - templist[-1] > 0.04:
        outlist.append(templist)
        templist = [x]
    else: 
        templist.append(x)
outlist.append(templist)
print(outlist)

Output:

[[26.051, 26.084, 26.117, 26.15, 26.183], [31.146, 31.183], [34.477, 34.51, 34.543]]

If you're a fan of itertools, you could use itertools.groupby() for this:

from itertools import groupby

l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]

def keyfunc(x):
    return (x[0] > 0 and round(l[x[0]] - l[x[0]-1], 3) == 0.033 or
            x[0] < len(l) - 1 and round(l[x[0]+1] - l[x[0]], 3) == 0.033)

print([[x[1] for x in g] for k, g in groupby(enumerate(l), key=keyfunc)])

Output:

[[26.051, 26.084, 26.117, 26.15, 26.183], [31.146, 31.183], [34.477, 34.51, 34.543]]

As far as the logic is concerned, the key function returns True for numbers that have a neighbor exactly 0.033 away (after rounding to three decimals) and False for those that don't. groupby() then groups consecutive items that share the same key.
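One caveat with a True/False key: two separate 0.033-spaced runs that sit directly next to each other (for example [1.0, 1.033, 5.0, 5.033]) would all get the key True and end up in one group. A possible variation, sketched below, gives every element a running group number instead. This is a different technique from the key function above, not a correction by its author; split_with_groupby and max_gap are hypothetical names, and the 0.04 default is an assumption.

```python
from itertools import accumulate, groupby

def split_with_groupby(values, max_gap=0.04):
    """Group values with groupby(), keyed on a running group number.

    Assumes values is a sequence (list or tuple), since it is sliced below.
    """
    # 1 marks a position that starts a new run (gap to its predecessor is too large).
    starts = [0] + [int(abs(b - a) > max_gap) for a, b in zip(values, values[1:])]
    # The running total of those flags is the index of the run each element belongs to.
    run_ids = accumulate(starts)
    return [
        [value for _, value in grp]
        for _, grp in groupby(zip(run_ids, values), key=lambda pair: pair[0])
    ]

l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]
print(split_with_groupby(l))
print(split_with_groupby([1.0, 1.033, 5.0, 5.033]))
```

Because the key only changes where a large gap occurs, adjacent runs stay separate no matter how regular their internal spacing is.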

My approach involves running through pairs of consecutive numbers and examining the gaps between them, just like everybody else's. The difference here is the use of iter() to create two iterators over one list.

# Given:
l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]
gap = 0.04  # a bit above the nominal 0.033 step: the data contains a 0.037 gap, plus float rounding

# Make two independent iterators (think: virtual lists) over the same list
previous_sequence, current_sequence = iter(l), iter(l)

# Initialize the groups while advancing current_sequence by 1
# element at the same time
groups = [[next(current_sequence)]]

# Iterate through pairs of numbers
for previous, current in zip(previous_sequence, current_sequence):
    if abs(previous - current) > gap:
        # Large gap, we create a new empty sublist
        groups.append([])

    # Keep appending to the last sublist
    groups[-1].append(current)

print(groups)

A few notes

  • My solution looks long, but if you subtract all the comments, blank lines, and the last print statement, it is only 6 lines
  • It is efficient because I did not actually duplicate the list
  • An empty list (empty l) will raise a StopIteration exception, so please ensure the list is not empty
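If an empty input is possible, one way to avoid the StopIteration is to pass a default to next(). The sketch below wraps the same two-iterator idea in a function; group_by_gap and the sentinel object are names introduced here for illustration, the 0.04 default is an assumption, and the function expects a list or tuple because calling iter() twice on a generator would return the same iterator both times.

```python
def group_by_gap(values, gap=0.04):
    """Two-iterator grouping that tolerates an empty input.

    Assumes values is a list or tuple, so the two iterators are independent.
    """
    previous_sequence, current_sequence = iter(values), iter(values)

    # next() with a default returns the sentinel instead of raising StopIteration.
    sentinel = object()
    first = next(current_sequence, sentinel)
    if first is sentinel:
        return []

    groups = [[first]]
    # current_sequence is now one element ahead of previous_sequence.
    for previous, current in zip(previous_sequence, current_sequence):
        if abs(previous - current) > gap:
            groups.append([])  # large gap: open a new sublist
        groups[-1].append(current)
    return groups

l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]
print(group_by_gap(l))
print(group_by_gap([]))
```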
