简体   繁体   中英

Python - filling gaps in list

I have a list made of tuples, generated from a database query, eg:

list = [(0,1,1), (1,2,1), (2,4,3), (4,2,1)]

The first number in each tuple must be a consecutive number, going from 0 to 15. There may also be missing ones, and I'm looking for the best method to fill gaps.

Currently I do this by looping through, but being the Python noob I am I figure it's sloppy and there's better ways:

# first fill in gaps
cnt = 0
for a,b,c in list:
    if a > cnt:
        list.insert(cnt, tuple((cnt, 0, 0)))
    cnt += 1

# then add any missing at end
while cnt < 16:
    list.append(tuple((cnt, 0, 0)))
    cnt += 1

So, expected output for the list at start would be:

list = [(0,1,1), (1,2,1), (2,4,3), (3,0,0), (4,2,1), (5,0,0), (6,0,0), (7,0,0), (8,0,0), (9,0,0), (10,0,0), (11,0,0), (12,0,0), (13,0,0), (14,0,0), (15,0,0)]

There are many ways, you could generate a new list like this:

data = [(0,1,1), (1,2,1), (2,4,3), (4,2,1)]

out = []
for i in range(16):
    if data and i == data[0][0]:
        out.append(data.pop(0))
    else:
        out.append((i, 0, 0))

print(out)
# [(0, 1, 1), (1, 2, 1), (2, 4, 3), (3, 0, 0), (4, 2, 1), 
# (5, 0, 0), (6, 0, 0), (7, 0, 0), (8, 0, 0), (9, 0, 0), 
# (10, 0, 0), (11, 0, 0), (12, 0, 0), (13, 0, 0), (14, 0, 0), (15, 0, 0)]

As a side note, I renamed your list data , as it is better to avoid using the names of builtin functions as variables.

You can convert the list of tuples to a dict indexed by the first items of the tuples first, so that you can iterate through the range of 0 to 15 to find the missing indices and produce default values for them in a list comprehension:

l = [(0,1,1), (1,2,1), (2,4,3), (4,2,1)]
d = {k: v for k, *v in l}
print([(i, *d.get(i, (0, 0))) for i in range(16)])

This outputs:

[(0, 1, 1), (1, 2, 1), (2, 4, 3), (3, 0, 0), (4, 2, 1), (5, 0, 0), (6, 0, 0), (7, 0, 0), (8, 0, 0), (9, 0, 0), (10, 0, 0), (11, 0, 0), (12, 0, 0), (13, 0, 0), (14, 0, 0), (15, 0, 0)]

In order to introduce some interesting language features, that can be used to solve this, here a solution based on Python's stable sort with a simple sort key function and itertools.groupby() to group the items based on the index.

You then take the first of two, if you had this index in the input data. Or the default item, if there was none with this index.

import itertools

received_items = [(0,1,1), (1,2,1), (2,4,3), (4,2,1)]
print(received_items)

default_items = [(i, 0, 0) for i in range(16)]

# append default items at the end
data = received_items + default_items
# perform stable sort on the index (first element) selected via key function
keyfunc = lambda x: x[0]
data.sort(key=keyfunc)
# always take first item for each index group
out = [next(v) for k,v in itertools.groupby(data, key=keyfunc)]

print(out)

The output is:

[(0, 1, 1), (1, 2, 1), (2, 4, 3), (4, 2, 1)]
[(0, 1, 1), (1, 2, 1), (2, 4, 3), (3, 0, 0), (4, 2, 1), (5, 0, 0), (6, 0, 0), (7, 0, 0), (8, 0, 0), (9, 0, 0), (10, 0, 0), (11, 0, 0), (12, 0, 0), (13, 0, 0), (14, 0, 0), (15, 0, 0)]

For reference:

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM