
Fast way to check consecutive subsequences for total

I have a list (up to 10,000 long) of numbers, each 0, 1, or 2. I need to count how many consecutive subsequences have a total which is NOT 1. My current method is to do the following for each list:

cons = 0
for i in range(seqlen+1):
    for j in range(i+1, seqlen+1):
        if sum(twos[i:j]) != 1:
            cons += 1

So an example input would be:

[0, 1, 2, 0]

and the output would be

cons = 8

as the 8 working subsequences are:

[0] [2] [0] [1, 2] [2, 0] [0, 1, 2] [1, 2, 0] [0, 1, 2, 0]

The issue is that simply going through all these subsequences (the i in range, j in range loops) takes nearly all of the time that is allowed, and when the if statement is added, the code takes far too long to run on the server. (To be clear, this is only a small part of a larger problem; I'm not just asking for the solution to an entire problem.) Anyway, is there any other way to check faster? I can't think of anything that wouldn't result in more operations needing to happen every time.

I think I see the problem: your terminology is incorrect / redundant. By definition, a sub-sequence is a series of consecutive elements.

Do not sum every candidate. Instead, identify every candidate whose sum is 1, and then subtract that count from the computed quantity of all sub-sequences (simple algebra).

All of the 1-sum candidates are of the regular expression form 0*10*: a 1 surrounded by any quantity of 0s on either or both sides.
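The claim above can be checked mechanically. The following is a minimal sketch, assuming the values are limited to 0, 1, and 2 as in the question; the names matches_form and form_agrees_with_sum are made up for this illustration. It compares the regular expression test against a plain sum on every short window.

```python
import re
from itertools import product

# Pattern from the answer: one 1 with any number of 0s on either side.
ONE_SUM = re.compile(r"0*10*")

def matches_form(window):
    """True if the window, written as a digit string, has the form 0*10*."""
    return ONE_SUM.fullmatch("".join(str(x) for x in window)) is not None

def form_agrees_with_sum(max_len):
    """Exhaustively compare the regex test with sum == 1 on short windows."""
    return all(
        (sum(window) == 1) == matches_form(window)
        for length in range(1, max_len + 1)
        for window in product((0, 1, 2), repeat=length)
    )

print(form_agrees_with_sum(6))  # True
```

This is only a sanity check on small inputs, not something to run on the full list.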

Identify all such maximal-length strings. For instance, in

210002020001002011

you will pick out 1000, 000100, 01, and 1. For each string, compute the quantity of substrings that contain the 1 (a simple equation on the lengths of the runs of 0s on each side: with a zeros on the left and b zeros on the right, it is (a + 1) * (b + 1)). Add up those quantities. Subtract from the total for the entire input. There's your answer.
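The steps above can be sketched in Python as follows. This is one possible reading of the approach, not a definitive implementation: the function name count_not_one and the helper lists zeros_left and zeros_right are invented here, and the input is assumed to be a list of ints that are each 0, 1, or 2. Instead of extracting the maximal strings explicitly, it records the run of zeros on each side of every 1.

```python
def count_not_one(seq):
    """Count contiguous windows of seq (values 0, 1, 2) whose sum is not 1."""
    n = len(seq)
    total = n * (n + 1) // 2  # number of non-empty contiguous windows

    # zeros_left[i]: how many consecutive zeros sit immediately before index i
    zeros_left = [0] * n
    run = 0
    for i, x in enumerate(seq):
        zeros_left[i] = run
        run = run + 1 if x == 0 else 0

    # zeros_right[i]: how many consecutive zeros sit immediately after index i
    zeros_right = [0] * n
    run = 0
    for i in range(n - 1, -1, -1):
        zeros_right[i] = run
        run = run + 1 if seq[i] == 0 else 0

    # Each 1 is the only nonzero element in (a + 1) * (b + 1) windows.
    ones = sum(
        (zeros_left[i] + 1) * (zeros_right[i] + 1)
        for i, x in enumerate(seq)
        if x == 1
    )
    return total - ones

print(count_not_one([0, 1, 2, 0]))  # 8
```

For the digit string in the example, the four maximal strings contribute 4 + 12 + 2 + 1 = 19 windows with sum 1, out of 18 * 19 / 2 = 171 windows in total. The whole computation is two linear passes over the list.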

Use the sliding window technique to solve this type of problem. Take two variables, first and last, to track the scope of the window. You start with sum equal to the first element. If the sum is larger than the required value, subtract the 'first' element from sum and increment first by 1. If the sum is smaller than required, add the next element after the 'last' pointer to sum and increment last by 1. Every time sum is equal to the required value, increment a counter.

As for NOT, count the number of sub-sequences having a sum of 1 and then subtract that from the total number of sub-sequences possible, i.e. n * (n + 1) / 2.
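A hedged sketch of this idea in Python follows. Note that it is a variant rather than the exact procedure described: when the list contains zeros, several windows ending at the same position can share the same sum, so a single pair of pointers that counts only exact matches may miss some of them. A common workaround for non-negative values, used here, is to count windows with sum at most k and take the difference between k = 1 and k = 0. The function names count_at_most and count_not_one_sliding are made up for this example.

```python
def count_at_most(seq, limit):
    """Count contiguous windows of a non-negative seq whose sum is <= limit."""
    if limit < 0:
        return 0
    count = 0
    window_sum = 0
    first = 0
    for last, x in enumerate(seq):
        window_sum += x
        # Shrink from the left until the window fits the limit again.
        while window_sum > limit:
            window_sum -= seq[first]
            first += 1
        # Every window ending at `last` and starting in first..last qualifies.
        count += last - first + 1
    return count

def count_not_one_sliding(seq):
    """Count contiguous windows whose sum is not 1."""
    n = len(seq)
    total = n * (n + 1) // 2
    ones = count_at_most(seq, 1) - count_at_most(seq, 0)
    return total - ones

print(count_not_one_sliding([0, 1, 2, 0]))  # 8
```

Each call to count_at_most moves both pointers forward only, so the work is linear in the length of the list.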
