Python Combine Repeating Elements

I have a list of lists of strings with some repeating elements that I want to combine into a shorter list.

The original list contents look something like this:

lst = [['0.1', '0', 'RC', '100'],
        ['0.2', '10', 'RC', '100'],
        ['0.3', '5', 'HC', '20'],
        ['0.4', '5', 'HC', '20'],
        ['0.5', '5', 'HC', '20'],
        ['0.6', '5', 'HC', '20'],
        ['0.7', '5', 'HC', '20'],
        ['0.8', '5', 'HC', '20'],
        ['0.9', '10', 'RC', '100'],
        ['1.0', '0', 'RC', '100']]

After running it through the function it would become:

lst = [['0.1', '0', 'RC', '100'],
        ['0.2', '10', 'RC', '100'],
        ['0.3', '5', 'HC', '20'],
        ['0.9', '10', 'RC', '100'],
        ['1.0', '0', 'RC', '100']]

The list will always have this general structure, so essentially I want to combine the list based on whether or not the last 3 columns are exactly the same.

I want it to be a callable function, so it would look something like:

def combine_list(lst):
    # pseudocode: sublist and next_sublist stand for two neighbouring rows
    if sublist[1:] == next_sublist[1:]:
        lst.remove(next_sublist)

My initial research turned up many methods for removing a sublist based on its index, but the index is not necessarily known beforehand. I also found the re module, but I have never used it and am unsure how to apply it here. Thank you in advance.

If you want to remove consecutive sublists whose last three elements are the same, you can use itertools.groupby keyed on the last three elements and keep the first sublist of each group:

from itertools import groupby
[next(g) for _, g in groupby(lst, key=lambda x: x[1:])]

#[['0.1', '0', 'RC', '100'],
# ['0.2', '10', 'RC', '100'],
# ['0.3', '5', 'HC', '20'],
# ['0.9', '10', 'RC', '100'],
# ['1.0', '0', 'RC', '100']]
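
Since the question asks for a callable function, the expression above can be wrapped in one. This is only a minimal sketch: the name combine_list comes from the question, while the parameter name rows and the shortened sample data are illustrative.

```python
from itertools import groupby


def combine_list(rows):
    """Return a new list keeping the first row of each consecutive run
    of rows whose elements after the first column are identical."""
    # groupby only groups *adjacent* items with equal keys, so the
    # RC rows before and after the HC block stay in separate groups.
    return [next(group) for _, group in groupby(rows, key=lambda row: row[1:])]


sample = [['0.1', '0', 'RC', '100'],
          ['0.2', '10', 'RC', '100'],
          ['0.3', '5', 'HC', '20'],
          ['0.4', '5', 'HC', '20'],
          ['0.9', '10', 'RC', '100']]

combined = combine_list(sample)
# combined keeps the rows starting with '0.1', '0.2', '0.3' and '0.9'
```

The input list is not modified; a new list is returned.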

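Maybe just use a set to keep track of duplicates? Note that this drops every later sublist whose last three elements have been seen anywhere earlier, not only consecutive ones, so for the sample list it would also drop the '0.9' and '1.0' rows.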

def combine_list(lst):
    out = []
    seen = set()
    for item in lst:
        key = tuple(item[1:])  # lists are unhashable, so use a tuple as the set key
        if key not in seen:
            out.append(item)
            seen.add(key)
    return out

Lists are mutable, so there is no guarantee that the contents of a list will not change over time. That means a list cannot be hashed, and hashing is what a set relies on to store its members. A tuple, on the other hand, is immutable, and is hashable as long as its elements are (strings are).
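
A small illustration of that difference, using one row from the question (the variable names here are only for the example):

```python
seen = set()
row = ['0.2', '10', 'RC', '100']

try:
    seen.add(row[1:])          # a list slice is still a list
    error_message = None
except TypeError as exc:
    error_message = str(exc)   # the set refuses the unhashable list

seen.add(tuple(row[1:]))       # converting to a tuple makes it usable as a key
```

After this runs, error_message holds the TypeError text and seen contains the single tuple ('10', 'RC', '100').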

for index in range(len(lst) - 1, 0, -1):
    if lst[index][1:] == lst[index - 1][1:]:
        lst.pop(index)

By going through the list backwards, we remove the problems with indices changing when we remove elements. This results in an in-place reduction.
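
To make the index problem concrete, here is a hedged sketch that wraps the backward loop in a function and contrasts it with a naive forward loop over the same index range. The function names and the shortened sample data are made up for this illustration.

```python
def combine_in_place(rows):
    """Remove consecutive rows with identical trailing columns, mutating rows."""
    for index in range(len(rows) - 1, 0, -1):
        if rows[index][1:] == rows[index - 1][1:]:
            rows.pop(index)


def combine_forward_naive(rows):
    """Same comparison, but walking forwards over the original index range."""
    for index in range(1, len(rows)):
        if rows[index][1:] == rows[index - 1][1:]:
            rows.pop(index)  # shifts every later row down by one


def make_rows():
    return [['0.1', '0', 'RC', '100'],
            ['0.3', '5', 'HC', '20'],
            ['0.4', '5', 'HC', '20'],
            ['0.5', '5', 'HC', '20'],
            ['1.0', '0', 'RC', '100']]


backward = make_rows()
combine_in_place(backward)      # leaves the '0.1', '0.3' and '1.0' rows

forward = make_rows()
try:
    combine_forward_naive(forward)
    forward_error = None
except IndexError as exc:
    forward_error = exc         # the range was computed before rows shrank
```

The forward version both skips a duplicate (the row that slides into the popped position is never compared) and eventually indexes past the end of the shortened list.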

If you'd like to make a new list, this can be done via list comprehension following the same idea, but since we're not doing it in place, we don't have to work in reverse:

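[lst[0]] + [lst[ind] for ind in range(1, len(lst)) if lst[ind][1:] != lst[ind-1][1:]]

Here lst[0] has no predecessor, so it is trivially non-duplicate and is always included. It has to be wrapped in its own list, because lst[0] + [...] would splice the four strings of the first row into the result instead of adding the row itself. This form assumes lst is not empty.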
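
As a variation on the same idea, the comparison with the previous row can be written by pairing each row with its predecessor via zip. This is a sketch rather than a definitive version; the function name combine_list_new is invented for the example.

```python
def combine_list_new(rows):
    """Return a new list without consecutive rows that share trailing columns."""
    # rows[:1] is [] for an empty input, so no special case is needed.
    # zip(rows, rows[1:]) yields (previous, current) pairs.
    return rows[:1] + [cur for prev, cur in zip(rows, rows[1:])
                       if cur[1:] != prev[1:]]


sample = [['0.1', '0', 'RC', '100'],
          ['0.3', '5', 'HC', '20'],
          ['0.4', '5', 'HC', '20'],
          ['1.0', '0', 'RC', '100']]

shortened = combine_list_new(sample)
```

Slicing instead of indexing the first row means an empty list simply comes back as an empty list.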

def combine_list(ls):
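    cpy = ls[:]  # shallow copy so the caller's list is left untouched

    for i, sub in enumerate(ls[:-1]):
        # compare everything after the first column with the following row
        if sub[1:] == ls[i + 1][1:]:
            cpy.remove(ls[i + 1])  # removes the first sublist equal to this row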

    return cpy

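This function should work for data like the sample. It makes a shallow copy of the list to avoid modifying the original, then iterates over the original list except for its last sublist, which has no successor to compare against.

For each sublist it checks whether every element after the first equals the corresponding elements of the next sublist. If so, the next sublist is removed from the copy. Because remove() deletes the first sublist that compares equal, this relies on every row being distinct as a whole, which holds here since the first column differs in each row.

The function then returns the new list.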
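
To illustrate how remove() matches by value, the sketch below feeds the function a hypothetical input in which some rows are identical in every column, then compares it with an index-based variant. The name combine_list_by_index and the sample rows are assumptions made for this example only.

```python
def combine_list(ls):
    cpy = ls[:]
    for i, sub in enumerate(ls[:-1]):
        if sub[1:] == ls[i + 1][1:]:
            cpy.remove(ls[i + 1])  # value-based: hits the first equal row
    return cpy


def combine_list_by_index(ls):
    # Decide per position, so equal-looking rows elsewhere are unaffected.
    return [row for i, row in enumerate(ls)
            if i == 0 or row[1:] != ls[i - 1][1:]]


a = ['0.1', '0', 'RC', '100']
b = ['0.2', '5', 'HC', '20']
rows = [a, b, list(a), list(a)]   # rows 0, 2 and 3 are equal in every column

by_value = combine_list(rows)           # the leading copy of a is removed
by_index = combine_list_by_index(rows)  # the trailing duplicate is removed
```

With the question's data this difference never shows up, because the first column makes every row unique.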
