简体   繁体   中英

How to find number of 010 in a certain range of a binary string

Given a binary string. How to find occurances of "010" within a certain range of the string.
For example, I have the string "0100110" . If the given range is 3 7 ( 1 based indexing ) then the output will be 4 . I could not find any faster way to solve it.

While trying this I can solve it in O(N) complexity. The approach is - firstly I point out the position of all '1' within the certain range and using those position I will figure out the number of '0' in both back and forth. And then multiply number of '0' found in back for a single '1' with number of '0' in found in forth. Then sum up the multiplied result for each of the '1' within the certain range.

For the given example the position of '1' within the range is {5, 6} . Now for index 5 I have number of '0' in both back and forth is 2 and 1 respectively. So we can make subsequence "010" is 2 . Similarly for index 6 we also get the answer is 2 . In total we can make the subsequence "010" is 4 times in total.

But when we have a number of Q queries of certain ranges for a given string then my approach easily reaches into the time complexity O(N 2 ) . I tried a lot but failed to find a way to optimize it. Can anybody help me with an approach that is less than O(N 2 ) complexity? Just to mention the time limit should be 1 Second . It will be a plus if you provide a pseudo code.

~Thanks in Advance.

Pretreatment: make auxiliary array containing cumulative number of zeros upto given position (with aux[0]=0)

  0 1 0 0 1 1 0  //string
0 1 1 2 3 3 3 4  //aux array A[]

For given L..R range scan for ones, for every k index of 1 get number of zeros in range - O(1) operation

P[k] = (A[k] - A[L-1]) * (A[R] - A[k])
S = Sum(P[k], k=L..R)

So we have O(RL) time per query and the worst case O(Q*N) for Q queries

But look at formula thoroughly:

P[k] = (A[k] - A[L-1]) * (A[R] - A[k]) = 
       A[k] * (A[R] + A[L-1]) - A[k]^2 - A[R] * A[L-1] = 
       A[k] * LRSum - A[k]^2 - LRProd
S = Sum(A[k] for ones) * LRSum - Sum(A[k]^2) - LRProd * NumOfOnes 

Note that LRSum and LRProd are constants for given query, and we have to calculate sums of A[k] for positions of ones and sum of squares for the same positions. Seems we can use the same idea of cumulative array and get result in O(1) per query.

Quick check gives (3+3)*5 - (9+9) - 4*2 = 30-18-8 = 4 for your example.

Using cumulative arrays:

  0 1 0 0 1  1  0  //string
0 1 1 2 3 3  3  4  //aux array A[]
0 0 1 1 1 4  7  7  //aux array B[]
0 0 1 1 1 10 19 19  //aux array C[]

Result = (B[R] - B[L-1]) * (A[R] + A[L-1]) - (C[R] - C[L-1]) - 
                 A[R] * A[L-1] * (R - L - 1 - (A[R] - A[L-1])) = 
         (7-1) * (4 + 1) - (19 - 1) - 4 * 1 * (7 - 2  - 4 + 1) = 4

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM