
What is a Pythonic way to remove doubled duplicates in a list but allow triplets/greater?

I have a list in Python

list1 = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]

There are "groupings" of consecutive 1s and 0s. For my purposes, I am only interested in the consecutive 1s. If there is a solitary 1, e.g.

 ... 0, 0, 1, 0, 0 ...

I would like it to be changed into a 0. Similarly, a run of exactly two 1s should become 0s as well, e.g.

.... 0, 0, 1, 1, 0, 0...

should become

.... 0, 0, 0, 0, 0, 0...

But "triplets" of consecutive ones or higher are ok.

I considered keeping track of the counts of 1s via a dictionary, but this feels too complex. The idea would be to iterate through the list, keeping track of the count of consecutive 1s in a list. But, how do you "go back" and switch the 1s to 0s?

counter_list = []
for i in list1:
    if i == 1:
        counter_list.append(i)  # record each consecutive 1
        if len(counter_list) == 1 or len(counter_list) == 2:
            # now I don't know how to "go back" to change these elements into 0s
            pass

This is an erosion followed by a dilation, both common operations in computer vision:

>>> from scipy.ndimage import binary_dilation, binary_erosion
>>> print(list1)
[0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
>>> print(binary_dilation(binary_erosion(list1)).astype(int))
[0 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1]

The composition of the two operations is called an opening:

>>> from scipy.ndimage import binary_opening
>>> print(binary_opening(list1).astype(int))
[0 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1]

You can use methods from image processing for this. The morphological operation "opening" with the structuring element [1 1 1] should do the job, and it should be a one-liner.
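To see why an opening with [1 1 1] removes runs shorter than three, here is a rough pure-Python sketch of the two steps. The helper names erode, dilate and opening are made up for illustration, and the sketch assumes that positions beyond either end of the list count as 0.

```python
def erode(bits):
    # Keep a 1 only where the element and both of its neighbours are 1.
    # Assumption: anything outside the list is treated as 0.
    padded = [0] + list(bits) + [0]
    return [int(padded[i - 1] and padded[i] and padded[i + 1])
            for i in range(1, len(padded) - 1)]


def dilate(bits):
    # Set a 1 wherever the element or either of its neighbours is 1.
    padded = [0] + list(bits) + [0]
    return [int(padded[i - 1] or padded[i] or padded[i + 1])
            for i in range(1, len(padded) - 1)]


def opening(bits):
    # Erosion wipes out runs of one or two 1s entirely and shrinks longer
    # runs by one element on each side; dilation then grows the survivors back.
    return dilate(erode(bits))


list1 = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
print(opening(list1))
# [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
```

A run of one or two 1s never has a position whose two neighbours are both 1, so nothing of it survives the erosion and the dilation has nothing to restore.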

itertools.groupby can help by grouping the list into runs of 0s or 1s. From there, we can use the length of a group of 1s to decide whether to switch it to 0s, and use itertools.chain.from_iterable to fuse the groups back into one stream:

import itertools

groups = ((key, list(group)) for key, group in itertools.groupby(list1))

fixed_groups = (group if key==0 or len(group)>2 else [0]*len(group)
                for key, group in groups)

result = list(itertools.chain.from_iterable(fixed_groups))
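For reuse, the same groupby idea can be wrapped in a small function. This is only a sketch: the name remove_short_runs and the min_len parameter are invented here and are not part of the answer above.

```python
import itertools


def remove_short_runs(bits, min_len=3):
    """Return a copy of bits with runs of 1s shorter than min_len set to 0."""
    fixed = []
    for key, group in itertools.groupby(bits):
        run = list(group)  # one maximal run of identical values
        if key == 1 and len(run) < min_len:
            run = [0] * len(run)  # too short: replace the whole run with 0s
        fixed.extend(run)
    return fixed


list1 = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
print(remove_short_runs(list1))
# [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
```

Because groupby yields every run, including one that reaches the end of the list, no special handling of the last run is needed.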

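Here is a simple for-loop. I am not sure it is the most efficient way, but it does the trick. The check after the loop covers a short run that ends the list.

list1 = [0,0,1,1,0,0,1,0,1,1,1,0,0,1,0,1,1,1]

counter_list = []  # indices of the current run of 1s
for i, elem in enumerate(list1):
    if elem == 1:
        counter_list.append(i)
    else:
        # a run just ended: zero it out if it had only one or two 1s
        if 0 < len(counter_list) <= 2:
            for e in counter_list:
                list1[e] = 0
        counter_list = []
# a short run that reaches the end of the list is never followed by a 0
if 0 < len(counter_list) <= 2:
    for e in counter_list:
        list1[e] = 0
print(list1)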

You could turn it into a string and use a regular expression to substitute short runs of '1' with equivalent runs of '0', then turn it back into a list:

>>> import re
>>> list1 = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
>>> ''.join(map(str, list1))
'0001110011000111111'
>>> re.sub(r'(?<!1)1{1,2}(?!1)', lambda x: len(x.group())*'0', _)
'0001110000000111111'
>>> list(map(int, _))
[0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
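The session above relies on `_`, which only holds the previous result in the interactive interpreter. In a script, the same steps could be wrapped in a function, roughly as follows. The name zero_short_runs is made up for this sketch, which assumes the list contains only the integers 0 and 1.

```python
import re


def zero_short_runs(bits):
    # Assumption: bits holds only 0s and 1s, so each element is one character.
    text = ''.join(map(str, bits))
    # Match one or two 1s that have no 1 directly before or after them,
    # and replace the match with the same number of 0s.
    fixed = re.sub(r'(?<!1)1{1,2}(?!1)', lambda m: '0' * len(m.group()), text)
    return [int(ch) for ch in fixed]


list1 = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
print(zero_short_runs(list1))
# [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
```

The lookbehind and lookahead are what keep the pattern from matching part of a longer run: without them, the first two 1s of a triplet would also be replaced.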
