Remove occurring elements from multiple lists (shorten multiple lists) by value

Say I have a list of lists:

[[0,0,0,1,2,3],[0,0,0,4,5,6],[0,0,0,0,7,8],[0,0,0,0,0,9]]

I want to end up with a list in which the common leading null/zero/keyword values have been removed from each of the lists within it, yielding the desired output:

[[1,2,3],[4,5,6],[0,7,8],[0,0,9]]

Obviously, looping through every list within that list and then comparing it against all the other lists is far from an ideal answer. Thanks.

If you were to sort those sublists, you would find that the maximum one has the fewest leading zeroes, and that count is the number you need to drop from all of them. (This relies on the values being non-negative, since lists compare lexicographically.) So just find the max:

x = [[0,0,0,1,2,3],[0,0,0,4,5,6],[0,0,0,0,7,8],[0,0,0,0,0,9]]

max(x)
Out[2]: [0, 0, 0, 4, 5, 6]

Figure out how many leading zeroes you need to drop:

from itertools import takewhile

#needlessly pedantic way of doing this
num_zeroes = len(list(takewhile(lambda p: p == 0, max(x))))

And slice accordingly:

[li[num_zeroes:] for li in x]
Out[12]: [[1, 2, 3], [4, 5, 6], [0, 7, 8], [0, 0, 9]]
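
The three steps above can be wrapped into one helper. This is only a minimal sketch: the name `strip_common_zeroes` is made up for illustration, and it assumes that every value is non-negative and that the sublists have equal length, so that `max` picks a sublist with the fewest leading zeroes.

```python
from itertools import takewhile

def strip_common_zeroes(lists):
    """Drop the leading zeroes shared by every sublist (hypothetical helper).

    Assumes non-negative values and equal-length sublists, so the
    lexicographic maximum has the fewest leading zeroes.
    """
    if not lists:
        return []
    # count the leading zeroes of the lexicographically largest sublist
    num_zeroes = sum(1 for _ in takewhile(lambda p: p == 0, max(lists)))
    # drop that many items from the front of every sublist
    return [li[num_zeroes:] for li in lists]

x = [[0,0,0,1,2,3],[0,0,0,4,5,6],[0,0,0,0,7,8],[0,0,0,0,0,9]]
print(strip_common_zeroes(x))  # [[1, 2, 3], [4, 5, 6], [0, 7, 8], [0, 0, 9]]
```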

Obviously, looping through every list within that list and then comparing it against all the other lists is far from an ideal answer.

Well, there's really no way around comparing the prefix to the prefix of every list.

But you can avoid comparing each entire list to every other list. In other words, you can make this O(NM), where M is the length of the common prefix, instead of O(N**2). Just do it in two passes: during the first pass, keep track of the longest zero prefix shared by all the lists seen so far; then use the result in the second pass to slice every list.
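
As a rough sketch of that two-pass idea using only `itertools` (the function names here are made up for illustration, and "prefix" is assumed to mean a run of leading values equal to 0):

```python
from itertools import islice, takewhile

def common_zero_prefix(lists):
    """Pass 1: length of the run of leading zeroes shared by all sublists."""
    prefix = None  # None means "no limit yet"; islice treats it as unbounded
    for lst in lists:
        # never look at more than `prefix` items of any later list
        head = islice(lst, prefix)
        prefix = sum(1 for _ in takewhile(lambda v: v == 0, head))
    return prefix or 0

def strip_common_prefix(lists):
    """Pass 2: slice the shared prefix off every sublist."""
    prefix = common_zero_prefix(lists)
    return [lst[prefix:] for lst in lists]

x = [[0,0,0,1,2,3],[0,0,0,4,5,6],[0,0,0,0,7,8],[0,0,0,0,0,9]]
print(strip_common_prefix(x))  # [[1, 2, 3], [4, 5, 6], [0, 7, 8], [0, 0, 9]]
```

Because each list is inspected only up to the current prefix length, the first pass does at most M comparisons per list after the first one. Unlike the `max` approach, this version does not depend on the values being non-negative.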

Alternatively, we can make it more explicit, calculating the zero prefix of each list capped at a maximum value (the shortest prefix found so far). It should be obvious that this is the same number of steps (although it will be slower by a small constant, because it does the inner loop in Python instead of in C):

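def first_nonzero(seq, stop=None):
    """Return the index of the first nonzero value, looking no further than index `stop`."""
    for i, val in enumerate(seq):
        if val or i == stop:
            return i
    # seq is empty or all zeroes: the whole sequence is prefix
    return len(seq)

list_o_lists = [[0,0,0,1,2,3],[0,0,0,4,5,6],[0,0,0,0,7,8],[0,0,0,0,0,9]]

# first pass: shrink the prefix to the shortest run of leading zeroes
prefix = None
for lst in list_o_lists:
    prefix = first_nonzero(lst, prefix)

# second pass: drop that many items from every list
output = [lst[prefix:] for lst in list_o_lists]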
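
The question also mentions null or keyword values, not just zero. A possible generalisation, sketched here with a hypothetical `filler` parameter and function name that are not part of the original answers, compares against an arbitrary placeholder value:

```python
def strip_common_filler(lists, filler=0):
    """Drop the run of leading `filler` values shared by every sublist.

    `filler` is an assumed parameter for illustration, e.g. 0, None or a keyword string.
    """
    prefix = None
    for lst in lists:
        # cap the scan at the shortest prefix found so far
        limit = len(lst) if prefix is None else min(prefix, len(lst))
        n = 0
        while n < limit and lst[n] == filler:
            n += 1
        prefix = n
    start = prefix or 0  # an empty input leaves prefix as None
    return [lst[start:] for lst in lists]

print(strip_common_filler([[None, None, 'a'], [None, 'b', 'c']], filler=None))
# [[None, 'a'], ['b', 'c']]
```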
