Keep sub-sequences of a binary list if they surpass a given length

I want to create a function that takes as input a list (or numpy array) A and a number L. A contains only 0s and 1s, and the goal is to keep each run of consecutive 1s whose length is at least L and to replace every shorter run with 0s. I wrote a function fix(A,L) that does this, but it takes too long to run, so I would like to know if there is a faster way.

def fix(A,L):
    i=0
    while True:
        if i==len(A):
            return(A)
        if A[i]==1:
            s=0
            for j in range(i,len(A)):
                if A[j]==1:
                    s+=1
                    continue
                else:
                    if s>=L:
                        break
                    else:
                        A[i:j]=[0]*len(A[i:j])
                        break
            if A[j]==1 and s<L:
                A[i:j+1]=[0]*len(A[i:j+1])
            i=j+1
        else:
            i+=1
            continue

If I call fix([1,0,0,1,1,1,0,1,1,1,1,0,1,1,0,1], 3) it returns [0,0,0,1,1,1,0,1,1,1,1,0,0,0,0,0], which is the correct answer.

You can use itertools.groupby and itertools.chain:

from itertools import groupby, chain

def fix(A, L):
    # Requires Python 3.8+ for the walrus operator (:=).
    # Keep a run if it is all 0s, or all 1s with length >= L; otherwise zero it.
    return list(chain.from_iterable(l if ((len(l:=list(g)) >= L and k) or not k) else [0]*len(l)
                                     for k, g in groupby(A)))

fix([1,0,0,1,1,1,0,1,1,1,1,0,1,1,0,1], 3)
# [0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
How it works

groupby(A) groups the input into runs of consecutive 0s or 1s. For each run we compute its length and check whether it is a run of 1s or of 0s. If it is a run of 0s, or a run of 1s of length ≥ L, we keep it; otherwise we replace it with a run of 0s of the same length. Finally, we chain everything together to form one continuous list.
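The steps described above can also be written as an explicit loop, which may be easier to follow than the one-liner. This is only a sketch of the same idea; the name fix_verbose is made up for illustration and is not part of the original answer. Unlike the function in the question, it builds a new list and leaves the input untouched.

```python
from itertools import groupby

def fix_verbose(A, L):
    """Zero out runs of 1s shorter than L (hypothetical helper name)."""
    result = []
    # groupby yields (value, iterator over the run) for each run of equal values
    for value, run in groupby(A):
        length = sum(1 for _ in run)  # number of items in this run
        if value == 1 and length < L:
            # run of 1s that is too short: replace it with 0s of the same length
            result.extend([0] * length)
        else:
            # run of 0s, or run of 1s with length >= L: keep it unchanged
            result.extend([value] * length)
    return result

print(fix_verbose([1,0,0,1,1,1,0,1,1,1,1,0,1,1,0,1], 3))
# [0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
```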

If you're working with 2D numpy arrays, what you want can be achieved with binary erosion followed by binary dilation (a morphological opening). We can use scipy.ndimage.binary_erosion and scipy.ndimage.binary_dilation.

Here we apply it along a single dimension only (within each row):

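import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

np.random.seed(0)
A = np.random.randint(0, 2, (10, 20))

L = 3
mask = np.ones((1, L))  # 1 x L structuring element: acts along rows only
# erosion removes runs of 1s shorter than L; dilation restores the surviving runs
binary_dilation(binary_erosion(A, mask), mask).astype(int)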

Example input:

array([[0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
       [0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0],
       [0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1],
       [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0],
       [0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0],
       [1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0],
       [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0],
       [1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1],
       [0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0],
       [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]])

Output:

array([[0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
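To see why erosion followed by dilation keeps only runs of at least L ones, the two steps can be sketched for a single 1D array with NumPy alone. This is not the SciPy implementation: it swaps in np.convolve as a stand-in for the two morphological operations, the name keep_long_runs_1d is hypothetical, and it assumes L ≥ 1.

```python
import numpy as np

def keep_long_runs_1d(a, L):
    """Keep runs of 1s of length >= L in a 1D 0/1 array (illustrative sketch)."""
    a = np.asarray(a, dtype=int)
    n = a.size
    if n < L:
        # no window of length L fits, so no run can be long enough
        return np.zeros(n, dtype=int)
    window = np.ones(L, dtype=int)
    # "erosion": starts[i] is True when a[i:i+L] consists only of 1s
    starts = np.convolve(a, window, mode="valid") == L
    # "dilation": mark the L positions covered by each surviving window
    kept = np.convolve(starts.astype(int), window, mode="full") > 0
    return kept.astype(int)

print(keep_long_runs_1d([1,0,0,1,1,1,0,1,1,1,1,0,1,1,0,1], 3).tolist())
# [0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
```

A position survives only if it lies inside at least one all-ones window of length L, which is exactly the condition of belonging to a run of at least L ones.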
