
How to count the number of ways of choosing K equal substrings from a list L (the list of all substrings)

Given a string S consisting of N lowercase English letters.

Suppose we have a list L consisting of all non empty substrings of the string S .

I need to count the number of ways to choose exactly K equal strings from the list L (note that the length of a substring need not equal K). Constraints: 1 ≤ N ≤ 5000, 1 ≤ K ≤ 10^9.

Example:

Let S=ababa. 

Then the list L = {"a", "b", "a", "b", "a", "ab", "ba", "ab", "ba", "aba", "bab", "aba", "abab", "baba", "ababa"}

Let K = 2.

The number of ways will be 7:

("a", "a")
("a", "a") 
("a", "a") 
("b", "b") 
("ab", "ab") 
("ba", "ba") 
("aba", "aba")

Similarly:

Let K = 3.

The number of ways will be 1:

("a", "a", "a")

"A list of all substrings". Why would you have a list of all substrings? A string of one million characters has about 500 billion substrings. The list of all substrings is not needed at all to solve the problem.

If K = 0 then there is one way. If K = 1 then every entry of L is a valid choice, so there are N(N + 1)/2 ways.

For each length m = 1 to N, a substring of length m can start at any index from 0 to N - m, which gives N - m + 1 substrings. Identify the distinct strings and count how many times each occurs using a hash table. Then for each distinct string that occurs n times with n >= K, add (n choose K) to your count.

That's it.

You can do it faster by looking at strings of length 1 first: ignore all those that occur fewer than K times, count the number of ways for the rest, then add another character to each surviving string and repeat. A longer string cannot occur more often than its prefix, so nothing is lost. Say K = 5 and the string has a million characters, but only two substrings of length 6 occur five or more times; then you only need to add characters to these two substrings.
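
The extend-and-prune idea above could be sketched roughly as follows. This is only an illustration under assumptions: it assumes K ≥ 1, the names `countByExtension` and `binomial` are made up here, and BigInt is used because the counts can grow past what a double represents exactly.

```javascript
// Exact binomial coefficient C(n, k) as a BigInt (hypothetical helper).
function binomial(n, k) {
  if (k > n) return 0n;
  k = Math.min(k, n - k);
  let result = 1n;
  for (let i = 1; i <= k; i++) {
    // After this step, result equals C(n - k + i, i), so the division is exact.
    result = result * BigInt(n - k + i) / BigInt(i);
  }
  return result;
}

// Counts the ways to pick k equal substrings, assuming k >= 1.
function countByExtension(str, k) {
  const n = str.length;
  let total = 0n;
  // Each group holds start indices whose substrings of the current length are equal.
  // Before the first round (length 0) every start index is in one group.
  let groups = [Array.from({ length: n }, (_, i) => i)];
  for (let len = 1; len <= n && groups.length > 0; len++) {
    const next = [];
    for (const group of groups) {
      // Split the group by the character that extends it to length len.
      const byChar = new Map();
      for (const start of group) {
        const end = start + len - 1;
        if (end >= n) continue; // this occurrence cannot be extended
        const ch = str[end];
        if (!byChar.has(ch)) byChar.set(ch, []);
        byChar.get(ch).push(start);
      }
      for (const starts of byChar.values()) {
        // Keep only substrings that still occur at least k times.
        if (starts.length >= k) {
          total += binomial(starts.length, k);
          next.push(starts);
        }
      }
    }
    groups = next;
  }
  return total;
}

console.log(countByExtension("ababa", 2)); // 7n
console.log(countByExtension("ababa", 3)); // 1n
```

Because groups smaller than K are dropped as soon as they appear, the work per round shrinks quickly on inputs with few long repeats.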

Build a suffix array for the given string.

Walk this array and look for runs of at least K neighbouring suffixes that share a common prefix.
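
One way the suffix array walk might look, as a sketch rather than a tuned implementation. It assumes K ≥ 1, builds the suffix array by plain sorting (acceptable for N ≤ 5000, but not the fast construction), and uses the made-up names `countWithSuffixArray` and `binomial`. For every length m it groups neighbouring suffixes whose common prefix is at least m long; a group of size n contributes C(n, K).

```javascript
// Exact binomial coefficient C(n, k) as a BigInt (hypothetical helper).
function binomial(n, k) {
  if (k > n) return 0n;
  k = Math.min(k, n - k);
  let result = 1n;
  for (let i = 1; i <= k; i++) {
    result = result * BigInt(n - k + i) / BigInt(i); // exact at every step
  }
  return result;
}

// Counts the ways to pick k equal substrings, assuming k >= 1.
function countWithSuffixArray(str, k) {
  const n = str.length;
  // Naive suffix array: start indices sorted by the suffix they begin.
  const sa = Array.from({ length: n }, (_, i) => i);
  sa.sort((a, b) => {
    const x = str.slice(a), y = str.slice(b);
    return x < y ? -1 : x > y ? 1 : 0;
  });
  // lcp[r] = length of the common prefix of the suffixes at ranks r - 1 and r.
  const lcp = new Array(n).fill(0);
  for (let r = 1; r < n; r++) {
    const a = sa[r - 1], b = sa[r];
    let len = 0;
    while (a + len < n && b + len < n && str[a + len] === str[b + len]) len++;
    lcp[r] = len;
  }
  let total = 0n;
  for (let m = 1; m <= n; m++) {
    let run = 0; // consecutive suffixes sharing the same first m characters
    for (let r = 0; r < n; r++) {
      if (n - sa[r] < m) {
        // Suffix shorter than m: it has no substring of length m.
        total += binomial(run, k);
        run = 0;
      } else if (run > 0 && lcp[r] >= m) {
        run++; // same length-m prefix as the previous suffix
      } else {
        total += binomial(run, k); // close the previous run, start a new one
        run = 1;
      }
    }
    total += binomial(run, k); // close the last run for this length
  }
  return total;
}

console.log(countWithSuffixArray("ababa", 2)); // 7n
console.log(countWithSuffixArray("ababa", 3)); // 1n
```

The sort keeps equal-prefix suffixes adjacent, which is why scanning neighbours is enough and no hash table of substrings is needed.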

Here's something in JavaScript:

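// Binomial coefficient C(n, k). Exact only while the result stays below 2^53.
function choose(n, k) {
  if (k > n) return 0;
  if (k == 0 || n == k) return 1;
  var p = n;
  // Multiply before dividing so that every intermediate value is an integer.
  for (var i = 2; i <= k; i++)
    p = p * (n + 1 - i) / i;
  return p;
}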

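function f(str, k) {
  var n = str.length,
      h = {},      // substring -> number of occurrences
      count = 0;

  for (var i = 0; i < n; i++) {
    var s = "";
    // A substring of length j - i + 1 occurs at most n - j + i times,
    // so stop extending once that bound drops below k.
    for (var j = i; k <= n - j + i && j < n; j++) {
      s += str.charAt(j);
      if (h[s])
        h[s]++;
      else
        h[s] = 1;
    }
  }

  // Each distinct substring occurring h[i] times contributes C(h[i], k).
  for (var i in h)
    count += choose(h[i], k);

  return count;
}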

Output:

console.log(f("ababa",2));
console.log(f("ababa",3));

7
1

As other people have noticed, you don't really need the list of substrings. Because you only care about equal substrings, you only need to count how many times a substring appears, and can use a hash/dictionary/map to keep track of that. Then, the number of ways to choose exactly k equal substrings when a substring appears n times is the binomial coefficient c(n,k) . You can add up all of those binomial coefficients for each different substring, and you have your answer.

Notice that if you are asking this question for multiple k values, you only need to build the hash/dictionary/map once.
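
A small sketch of that reuse, assuming K ≥ 1 and using made-up names (`buildSubstringCounts`, `waysFor`, `binomial`). The map is built once and then queried for several K values. Storing every substring as a key is memory-hungry for N near 5000, so treat this as an illustration of the idea only.

```javascript
// Exact binomial coefficient C(n, k) as a BigInt (hypothetical helper).
function binomial(n, k) {
  if (k > n) return 0n;
  k = Math.min(k, n - k);
  let result = 1n;
  for (let i = 1; i <= k; i++) {
    result = result * BigInt(n - k + i) / BigInt(i); // exact at every step
  }
  return result;
}

// Build the substring -> occurrence count map once.
function buildSubstringCounts(str) {
  const counts = new Map();
  for (let i = 0; i < str.length; i++) {
    let s = "";
    for (let j = i; j < str.length; j++) {
      s += str[j];
      counts.set(s, (counts.get(s) || 0) + 1);
    }
  }
  return counts;
}

// Answer one query from the prebuilt map: sum of C(n, k) over distinct substrings.
function waysFor(counts, k) {
  let total = 0n;
  for (const n of counts.values()) total += binomial(n, k);
  return total;
}

const counts = buildSubstringCounts("ababa");
console.log(waysFor(counts, 1)); // 15n
console.log(waysFor(counts, 2)); // 7n
console.log(waysFor(counts, 3)); // 1n
```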

Without knowing which language you're working in, I believe you can accomplish this with a simple nested loop: compare each value with every other value in the array or list.
