Maximum repeating sequence instead of longest repeating sequence
I am trying to get the most repeated sequence of characters in a string. For example:
Input:
s = "abccbaabccba"
Output:
2
I have used dynamic programming to find the repeating sequence, but this returns the longest repeating character sequence. For example:
Input:
s = "abcabcabcabc"
Output:
2
That is, it returns 2 (abcabc, abcabc) instead of the expected 4 (abc, abc, abc, abc).
Here is the part of the code where I'm filling the DP table and extracting the repeating sequence. Can anyone suggest how I can get the most repeated sequence?
//Run through the string and fill the DP table.
char[] chars = s.toCharArray();
for(int i = 1; i <= length; i++){
    for(int j = 1; j <= length; j++){
        // Extend a match only while the two occurrences do not overlap.
        if( chars[i-1] == chars[j-1] && Math.abs(i-j) > table[i-1][j-1]){
            table[i][j] = table[i-1][j-1] + 1;
            if(table[i][j] > max_length_sub){
                max_length_sub = table[i][j];
                array_index = Math.min(i, j);
            }
        }else{
            table[i][j] = 0;
        }
    }
}
//Check if there was a repeating sequence and return the number of times it occurred.
if( max_length_sub > 0 ){
    String temp = s;
    String subSeq = "";
    // Copy the max_length_sub characters that end just before array_index.
    for(int i = (array_index - max_length_sub); i < array_index; i++){
        subSeq = subSeq + s.charAt(i);
    }
    System.out.println( subSeq );
    Pattern pattern = Pattern.compile(subSeq);
    Matcher matcher = pattern.matcher(s);
    int count = 0;
    while (matcher.find())
        count++;
    // To find left overs - doesn't seem to matter
    String[] splits = temp.split(subSeq);
    if (splits.length == 0){
        return count;
    }else{
        return 0;
    }
}
Simple and dumb: the smallest sequence that needs to be considered is a pair of characters (*):
use a for loop and substring to get each consecutive pair of characters;
create a method countOccurrences(), using indexOf(String, int) or regular expressions, to count how often that pair occurs; and
keep a maxCount outside the loop and use an if to check whether the current count is greater (or use Math.max()).
(*) If "abc" occurs 5 times, then "ab" (and "bc") will occur at least 5 times too, so it is enough to search just for "ab" and "bc"; there is no need to check "abc".
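The steps above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the answerer's actual code: the class name MostRepeatedPair and the method maxPairCount are hypothetical, and countOccurrences is assumed to count non-overlapping matches (the same way Matcher.find() does in the question).

```java
class MostRepeatedPair {
    // Counts non-overlapping occurrences of needle in haystack
    // using String.indexOf(String, int). (Non-overlapping is an assumption.)
    static int countOccurrences(String haystack, String needle) {
        int count = 0;
        int from = 0;
        while ((from = haystack.indexOf(needle, from)) != -1) {
            count++;
            from += needle.length(); // skip past this match
        }
        return count;
    }

    // Returns the highest occurrence count over all consecutive
    // character pairs in s (0 if s has fewer than 2 characters).
    static int maxPairCount(String s) {
        int maxCount = 0;
        for (int i = 0; i + 2 <= s.length(); i++) {
            String pair = s.substring(i, i + 2);
            maxCount = Math.max(maxCount, countOccurrences(s, pair));
        }
        return maxCount;
    }

    public static void main(String[] args) {
        System.out.println(maxPairCount("abccbaabccba"));  // prints 2
        System.out.println(maxPairCount("abcabcabcabc"));  // prints 4
    }
}
```

Each pair is recounted every time it appears, so this is quadratic in the length of the string; caching counts per pair in a map would avoid the repeated work.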
Edit (without leftovers, see comments), summary:
check if the first character is repeated over the whole string; if not,
check if the first 2 characters are repeated over the whole string; if not,
check if the first 3 ...
At least 2 counters/loops are needed: one for the number of characters to test, the second for the position being tested. Some arithmetic can be used to improve performance: the length of the string must be divisible by the number of repeated characters without remainder.
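A minimal sketch of this "no leftovers" variant, assuming the goal is the number of times the shortest prefix repeats to cover the whole string. The class name RepeatedPrefix and the method repeatCount are hypothetical, and returning 1 when no shorter prefix covers the string is an assumption (the code in the question returns 0 in that case).

```java
class RepeatedPrefix {
    // Returns how many times the shortest prefix that covers the whole
    // string is repeated. Returns 1 if no shorter prefix covers it,
    // and 0 for the empty string (both are assumptions of this sketch).
    static int repeatCount(String s) {
        int n = s.length();
        // Outer counter: number of characters in the candidate prefix.
        for (int len = 1; len <= n / 2; len++) {
            // The length must be divisible by the prefix length.
            if (n % len != 0) {
                continue;
            }
            boolean covers = true;
            // Inner counter: position being tested against the
            // character one prefix length earlier.
            for (int pos = len; pos < n; pos++) {
                if (s.charAt(pos) != s.charAt(pos - len)) {
                    covers = false;
                    break;
                }
            }
            if (covers) {
                return n / len;
            }
        }
        return n == 0 ? 0 : 1;
    }

    public static void main(String[] args) {
        System.out.println(repeatCount("abcabcabcabc"));  // prints 4
        System.out.println(repeatCount("abccbaabccba"));  // prints 2
    }
}
```

Because prefix lengths are tried from shortest to longest, the first one that covers the string gives the largest repeat count.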