Efficient Algorithm to count number of subsequences divisible by 6

Given a string of decimal digits, I have to find the number of all subsequences divisible by 6.

1 ≤ value of String ≤ 10^6

I've tried the naive approach of iterating over all possible subsequences and obtaining the answer, but that is not fast enough, especially with such a huge upper bound on the string length. Then I tried a DP approach, but was unable to code a DP solution for the given range. Can someone please provide a lead on this problem?

Sample Input
1232
Output
3
Strings Possible - 12,12,132
//Ans should be modulo 10^9 + 7
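
For reference, the naive enumeration mentioned above might look like the following. This is only a sketch for checking faster solutions on short inputs (it is exponential in the string length), and the function name `count_brute` is made up for illustration.

```python
from itertools import combinations

def count_brute(s):
    """Count non-empty subsequences of the digit string s divisible by 6."""
    total = 0
    for length in range(1, len(s) + 1):
        # every choice of positions, kept in increasing order, is one subsequence
        for idx in combinations(range(len(s)), length):
            if int("".join(s[i] for i in idx)) % 6 == 0:
                total += 1
    return total

print(count_brute("1232"))
```

Subsequences are counted by position, so the two ways of picking "12" from 1232 are counted separately, matching the sample.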

Below is the DP code (I am not completely sure about it) for finding the total number of subsequences divisible by 3. To check for 6, we also need to incorporate divisibility by 2, which is what is causing me trouble.

for (i = 0; i < n; i++) {
    for (j = 0; j < 3; j++) {
        dp[i][j] = 0;
    }
    int dig = (str[i] - '0') % 3;
    dp[i][dig]++;
    if (i > 0) {
        for (j = 0; j < 3; j++) {
            if (dig % 3 == 0) {
                dp[i][j] += dp[i-1][j];
            }
            if (dig % 3 == 1) {
                dp[i][j] += dp[i-1][(j+2)%3];
            }
            if (dig % 3 == 2) {
                dp[i][j] += dp[i-1][(j+1)%3];
            }
        }
    }
}
long long ans = 0;
for (i = 0; i < n; i++) {
    ans += dp[i][0];
}
return ans;
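
One way the mod-3 idea might be extended to handle the factor 2 is sketched below in Python. It is not the code above: it relies on the fact that a number is divisible by 6 exactly when its digit sum is divisible by 3 and its last digit is even, so it keeps counts of digit sums modulo 3 over all subsequences and only adds to the answer at even digits. The name `count_div6_by_sum` and the array `cnt` are assumptions made for this illustration.

```python
MOD = 10**9 + 7

def count_div6_by_sum(s):
    # cnt[r]: subsequences of the prefix read so far (the empty one included)
    # whose digit sum is congruent to r modulo 3
    cnt = [1, 0, 0]
    ans = 0
    for ch in s:
        d = int(ch)
        if d % 2 == 0:
            # any earlier subsequence with digit sum == -d (mod 3) can be
            # extended by this even digit to give a multiple of 6
            ans = (ans + cnt[(-d) % 3]) % MOD
        # each subsequence is either kept as is or extended by d
        cnt = [(cnt[r] + cnt[(r - d) % 3]) % MOD for r in range(3)]
    return ans

print(count_div6_by_sum("1232"))
```

This runs in O(N) time with constant extra space; subsequences with leading zeros are counted by value, like any other.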

Let SS(x, k, m) = the number of subsequences of the string x that represent a number equal to k modulo m.

SS([], k, m) = 1 if k == 0 otherwise 0   -- (see footnote at end)
SS(x + [d], k, m) = SS(x, k, m) + sum(SS(x, j, m) where j*10+d == k modulo m)

That is, if you add a digit d to x, then the subsequences equal to k modulo m are the subsequences of x equal to k, plus the subsequences of x equal to some j for which (10*j) plus the new digit is k modulo m.

That turns into a nice dynamic program, which, if N is the length of the string and m the number you want subsequences to be divisible by, runs in O(Nm + m^2) time and uses O(m) space. For m=6, this is O(N) time and O(1) space.

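# Count the non-empty subsequences whose value is divisible by m.
def subseq(digits, m):
    # a[k] = number of subsequences so far (empty one included) equal to k modulo m
    a = [1] + [0] * (m - 1)
    # indexes[i] lists the residues j with 10*j == i (mod m)
    indexes = [[j for j in range(m) if (10*j-i)%m == 0] for i in range(m)]
    for digit in digits:
        a = [a[i] + sum(a[j] for j in indexes[(i - digit) % m]) for i in range(m)]
    return a[0] - 1

print(subseq(list(map(int, '1232')), 6))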

Footnote: the definition of SS counts the empty list as representing 0, but the empty string isn't a valid number, so the function subtracts one before returning.
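
The same recurrence could also be written in "push" form, with the 10^9 + 7 modulus that the question asks for. This is a sketch rather than the code above: the names `count_divisible`, `ways` and `MOD` are assumptions, and each residue r is forwarded to (10*r + d) mod m instead of looking up the predecessors of each target residue.

```python
MOD = 10**9 + 7

def count_divisible(s, m=6):
    # ways[r]: subsequences of the prefix read so far (the empty one included)
    # whose value is congruent to r modulo m
    ways = [0] * m
    ways[0] = 1
    for ch in s:
        d = int(ch)
        nxt = ways[:]                # subsequences that skip this digit
        for r in range(m):
            t = (10 * r + d) % m     # residue after appending digit d
            nxt[t] = (nxt[t] + ways[r]) % MOD
        ways = nxt
    return (ways[0] - 1) % MOD       # drop the empty subsequence

print(count_divisible("1232"))
```

Each digit costs O(m) work, so the whole pass is O(Nm) time and O(m) space, which for m = 6 is linear in the string length.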

This problem can be solved in linear time, O(N), and linear space, O(N), with N being the length of the string, if we are to consider only substrings. I am still trying to build an algorithm for subsequences.

Key points:

1. All the substrings that are divisible by 6 are divisible by 2 and 3, so we will focus on divisibility by these two numbers.

2. This means all candidate substrings must end with either 0 or 2 or 4 or 6 or 8, to satisfy divisibility by 2, AND

3. The sum of the digits of the substring must be divisible by 3.

First we take an array arr of length N. We fill it such that

arr[i] = 1, if the ith digit of the string is 0 or 2 or 4 or 6 or 8.

else arr[i] = 0.

This can easily be done in a single traversal of the string.

What we have achieved is that we now know all our candidate substrings end at an index i of the string such that arr[i] = 1, because we have to satisfy divisibility by 2.

Now take another array arr1, initialized to 0 for all indexes. We fill it such that

arr1[i] = 1, only if sum of digits from index 0 to index i is divisible by 3 

or from index j to i is divisible by 3, such that j < i.

else arr1[i] = 0

To fill the array arr1, the algorithm is as follows:

sum = 0
for(i = 0 to length of string - 1)
{
 sum = sum + digit at index i;
 if(sum%3 == 0)
 {
  arr1[i] = 1 
  sum = 0
 }
}

Now we must take care of the fact that even if the sum of digits from index 0 to index i is divisible by 3, it is possible that the sum of digits from index j to i is also divisible by 3, for some 0 < j < i.

For this we need another array that keeps track of how many such substrings we have found so far.

Let the array be track, such that

track[i] = x, if there are x number of 1's in array arr1 for indices j < i.

We don't need another traversal; we can modify our previous algorithm as follows:

initialize array track to be 0 for all entries.
sum = 0
found = -1
for(i = 0 to length of string - 1)
{
 sum = sum + digit at index i;
 if(sum%3 == 0)
 {
  arr1[i] = 1
  ++found
  track[i] = found
  sum = 0
 }
}

Now comes the important part, which is counting.

Claim:

A substring ending at index i will only contribute to the count iff:

arr[i] == 1 and arr1[i] == 1

This is clear because we have to satisfy divisibility by both 2 and 3. The contribution towards the count would be:

count = count + track[i] + 1

The 1 is added because track[i] only counts indices j < i, as in its definition:

track[i] = x, if there are x number of 1's in array arr1 for indices j < i.

The algorithm is fairly easy to implement; take that up as an exercise.
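
For the substring version, a commonly used alternative is to count prefix digit sums by their remainder modulo 3 instead of resetting the running sum. The sketch below uses that prefix-remainder technique rather than the arr/arr1/track arrays described above; a reset-based pass can miss substrings such as "12" inside "112", which this variant does count. The names `count_substrings_div6` and `seen` are made up for illustration.

```python
def count_substrings_div6(s):
    # seen[r]: number of prefixes so far (the empty prefix included)
    # whose digit sum is congruent to r modulo 3
    seen = [1, 0, 0]
    prefix = 0
    total = 0
    for ch in s:
        d = int(ch)
        prefix = (prefix + d) % 3
        if d % 2 == 0:
            # a substring ending here has digit sum divisible by 3 exactly
            # when it starts right after a prefix with the same remainder
            total += seen[prefix]
        seen[prefix] += 1
    return total

print(count_substrings_div6("1232"))
```

This takes O(N) time and constant extra space, and it counts only contiguous substrings, so it does not answer the subsequence question.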

An exponential (in the general case) recursive solution, which becomes linear if the maximum value that a match can represent is 1e6.

def recurse(x, substr, digits):
    # x is the integer value of substr, the subsequence chosen so far
    if len(substr) == 6:  # the value represented by the string may not be > 1e6
        return
    if digits:
        nxt = x * 10 + int(digits[0])
        if nxt % 6 == 0:
            print(nxt)
        recurse(nxt, substr + digits[0], digits[1:])  # grow the "window"
        recurse(x, substr, digits[1:])  # shift the "window"

digits = "123163736395067251284059573634848487474"

recurse(0, "", digits)
