How do I check if a string is made up exclusively of same-length character groups?
I want to identify strings that are made up exclusively of same-length character groups. Each one of these groups consists of at least two identical characters. So, here are some examples:
aabbcc true
abbccaa false
xxxrrrruuu false (too many r's)
xxxxxfffff true
aa true (shortest possible positive example)
aabbbbcc true // I added this later to clarify my intention
@ilkkachu: Thanks for your remark concerning the repetition of the same character group. I added the example above. Yes, I want the last sample to be tested as true: a string made up of the two-letter groups aa, bb, bb, cc.
Is there a simple way to apply this condition check on a string using regular expressions and JavaScript?
My first attempt was to do something like
var strarr=['aabbcc','abbccaa','xxxrrrruuu',
'xxxxxfffff','aa','aabbbbcc','negative'];
var rx=/^((.)\2+)+$/;
console.log(strarr.map(str=>str+': '+!!str.match(rx)).join('\n'));
It does look for groups of repeated characters but does not yet pay attention to whether these groups are all of the same length, as the output shows:
aabbcc: true
abbccaa: false
xxxrrrruuu: true // should be false!
xxxxxfffff: true
aa: true
aabbbbcc: true
negative: false
How do I get the check to look for same-length character groups?
Getting all the groups of the same character has an easy regex solution:
/(.)\1*/g
Just repeat the backreference \1 of the character in capture group 1.
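As a quick illustration of what that regex returns, here is a minimal sketch. The helper name groupLengths is made up for this example and is not part of the answer.

```javascript
// Hypothetical helper: split a string into runs of identical characters
// and return the length of each run.
function groupLengths(str) {
  const runs = str.match(/(.)\1*/g) || []; // match() returns null when nothing matches
  return runs.map(run => run.length);
}

console.log('aabbbbcc'.match(/(.)\1*/g)); // [ 'aa', 'bbbb', 'cc' ]
console.log(groupLengths('aabbbbcc'));    // [ 2, 4, 2 ]
console.log(groupLengths('xxxrrrruuu'));  // [ 3, 4, 3 ]
```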
Then just check whether any length in the array of same-character strings doesn't match up (in the snippet below: isn't a multiple of the smallest length).
Example snippet:
function sameLengthCharGroups(str)
{
  if(!str) return false;
  let arr = str.match(/(.)\1*/g) //array with same character strings
    .map(function(x){return x.length}); //array with lengths
  let smallest_length = arr.reduce(function(x,y){return x < y ? x : y});
  if(smallest_length === 1) return false;
  return arr.some(function(n){return (n % smallest_length) !== 0}) == false;
}

console.log("-- Should be true :");
let arr = ['aabbcc','xxxxxfffff','aa'];
arr.forEach(function(s){console.log(sameLengthCharGroups(s)+' : '+ s)});

console.log("-- Should also be true :");
arr = ['aabbbbcc','224444','444422',
       '666666224444666666','666666444422','999999999666666333'];
arr.forEach(function(s){console.log(sameLengthCharGroups(s)+' : '+ s)});

console.log("-- Should be false :");
arr = ['abbcc','xxxrrrruuu','a','ab','',undefined];
arr.forEach(function(s){console.log(sameLengthCharGroups(s)+' : '+ s)});
ECMAScript 6 version with fat arrows (doesn't work in IE):
function sameLengthCharGroups(str)
{
  if(!str) return false;
  let arr = str.match(/(.)\1*/g)
    .map((x) => x.length);
  let smallest_length = arr.reduce((x,y) => x < y ? x : y);
  if(smallest_length === 1) return false;
  return arr.some((n) => (n % smallest_length) !== 0) == false;
}
Or use exec instead of match, which should be faster for huge strings, since it can exit the while loop as soon as a non-matching length is found.
The disadvantage is that this way it can't get the minimum of ALL the lengths before comparing them.
So some strings whose shortest group comes late are not accepted this way (for example 666666444422, with groups of 6, 4 and 2).
function sameLengthCharGroups(str)
{
  if(!str) return false;
  const re = /(.)\1*/g;
  let m, smallest_length;
  while(m = re.exec(str)){
    if(m.index === 0) {smallest_length = m[0].length}
    if(smallest_length > m[0].length && smallest_length % m[0].length === 0){smallest_length = m[0].length}
    if(m[0].length === 1 ||
       // m[0].length !== smallest_length
       (m[0].length % smallest_length) !== 0
      ) return false;
  }
  return true;
}

console.log("-- Should be true :");
let arr = ['aabbcc','xxxxxfffff','aa'];
arr.forEach(function(s){console.log(sameLengthCharGroups(s)+' : '+ s)});

console.log("-- Should also be true :");
arr = ['aabbbbcc','224444','444422',
       '666666224444666666','666666444422','999999999666666333'];
arr.forEach(function(s){console.log(sameLengthCharGroups(s)+' : '+ s)});

console.log("-- Should be false :");
arr = ['abbcc','xxxrrrruuu','a','ab','',undefined];
arr.forEach(function(s){console.log(sameLengthCharGroups(s)+' : '+ s)});
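One possible way around the "minimum not known yet" problem is to track the greatest common divisor of the run lengths instead of the smallest one. This is a swapped-in technique (Euclid's algorithm), not the method of the answer above, and the function name sameLengthCharGroupsGcd is hypothetical. It assumes the question is read as "the string can be cut into equal-length groups of at least two identical characters", so runs of 4 and 6 also pass (as groups of 2). A sketch under those assumptions:

```javascript
// Greatest common divisor (Euclid's algorithm).
function gcd(a, b) {
  while (b !== 0) {
    [a, b] = [b, a % b];
  }
  return a;
}

// Hypothetical single-pass variant: the string qualifies when all run
// lengths share a common divisor of at least 2.
function sameLengthCharGroupsGcd(str) {
  if (!str) return false;
  const re = /(.)\1*/g;
  let m, common = 0; // gcd(0, n) === n, so 0 is a neutral start value
  while ((m = re.exec(str)) !== null) {
    common = gcd(common, m[0].length);
    if (common === 1) return false; // early exit: no group length >= 2 fits anymore
  }
  return common >= 2;
}

console.log(sameLengthCharGroupsGcd('666666444422')); // true  (runs of 6, 4, 2)
console.log(sameLengthCharGroupsGcd('aaaabbbbbb'));   // true  (runs of 4 and 6, groups of 2)
console.log(sameLengthCharGroupsGcd('xxxrrrruuu'));   // false (runs of 3, 4, 3)
```

The early exit is kept because the common divisor can only shrink as more runs are read; once it reaches 1 the result is settled.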
Here's one that runs in linear time:
function test(str) {
  // the question asks for at least one group, so an empty string fails
  if (str.length === 0) return false;
  let lastChar = str.charAt(0);
  let seqLength = 1;
  let lastSeqLength = null;
  for (let i = 1; i < str.length; i++) {
    if (str.charAt(i) === lastChar) {
      seqLength++;
    }
    else if (lastSeqLength === null || seqLength === lastSeqLength) {
      lastSeqLength = seqLength;
      seqLength = 1;
      lastChar = str.charAt(i);
    }
    else {
      return false;
    }
  }
  // all groups must have the same length, and that length must be at least 2
  return seqLength > 1 && (lastSeqLength === null || lastSeqLength === seqLength);
}
Using the sticky flag y and the replace method you could do this much faster. This trick replaces every group that has the same length as the first one with an empty string (and stops as soon as a group with a different length occurs), then checks whether any characters are left:
var words = ['aabbcc', 'abbccaa', 'xxxrrrruuu', 'xxxxxfffff', 'aa'];
words.forEach(w => {
  var l; // length of the first group
  console.log(w + " => " + (w.replace(/(.)\1+/gy, ($0, $1, o) => {
    return $0.length == (o == 0 ? l = $0.length : l) ? '' : $0;
  }).length < 1));
});
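To see why the sticky flag matters here, the following small sketch compares the same pattern with and without y. The variable name run is only for this illustration.

```javascript
const run = '(.)\\1+'; // two or more identical characters

// Without y, the search may skip ahead to find a match.
console.log(new RegExp(run).test('abb'));      // true ('bb' at index 1)
// With y, the match must start exactly at lastIndex (0 here).
console.log(new RegExp(run, 'y').test('abb')); // false

// Combined with g in replace(): matching stops at the first position
// where no run starts, so the rest of the string is left untouched.
console.log('aabcc'.replace(new RegExp(run, 'g'), ''));  // 'b'
console.log('aabcc'.replace(new RegExp(run, 'gy'), '')); // 'bcc'
```

That is the reason a lone character anywhere in the input leaves text behind, which the length check then reports as a failure.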
Another workaround would be using replace() along with test(). The first replaces each group of repeated characters with its length, and the second checks that the resulting string consists of one and the same number repeated:
var str = 'aabbc';
/^(\d+\n)\1*$/.test(str.replace(/(.)\1+/gy, x => x.length + '\n'));
Demo:
var words = ['aabbcc', 'abbccaa', 'xxxrrrruuu', 'xxxxxfffff', 'aa'];
words.forEach(w =>
  console.log(/^(\d+\n)\1*$/.test(w.replace(/(.)\1+/gy, x => x.length + '\n')))
);
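To make the intermediate step visible, this sketch prints the string that replace() hands to test(). The function name encodeRunLengths is hypothetical and only wraps the replace call from the answer.

```javascript
// Hypothetical wrapper: replace each leading run of repeated characters
// by its length followed by a newline.
function encodeRunLengths(str) {
  return str.replace(/(.)\1+/gy, x => x.length + '\n');
}

// JSON.stringify is used only to make the newlines visible.
console.log(JSON.stringify(encodeRunLengths('aabbcc')));     // "2\n2\n2\n"
console.log(JSON.stringify(encodeRunLengths('xxxrrrruuu'))); // "3\n4\n3\n"
console.log(JSON.stringify(encodeRunLengths('aabbc')));      // "2\n2\nc"

console.log(/^(\d+\n)\1*$/.test(encodeRunLengths('aabbcc')));     // true
console.log(/^(\d+\n)\1*$/.test(encodeRunLengths('xxxrrrruuu'))); // false
```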
Since the requirements changed or weren't clear at first, this is the third solution I'm posting. To accept strings that could be divided into smaller groups, like aabbbb, we could:
1. Find the lengths of all groups of identical characters, which are 2 and 4 in this case.
2. Push them into an array named d.
3. Find the lowest length in the set, named m.
4. Check whether all values in d have no remainder when divided by m.
Demo:
var words = ['aabbbcccdddd', 'abbccaa', 'xxxrrrruuu', 'xxxxxfffff', 'aab', 'aabbbbccc'];
words.forEach(w => {
  var d = [], m = Number.MAX_SAFE_INTEGER, l;
  var s = w.replace(/(.)\1+/gy, x => {
    d.push(l = x.length); // collect every group length
    if (l < m) m = l;     // keep track of the smallest one
    return '';
  });
  console.log(w + " => " + (s == '' && !d.some(n => n % m != 0)));
});
The length of the repeated pattern of same characters needs to be specified within the regular expression. The following snippet creates regular expressions looking for group lengths of 11 down to 2. The for-loop is exited once a match is found and the function returns the length of the pattern found:
function pat1(s){
for (var i=10;i;i--)
if(RegExp('^((.)\\2{'+i+'})+$').exec(s))
return i+1;
return false;}
If nothing is found, false is returned.
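To show what pat1 actually builds on each pass of the loop, here is a small sketch. The helper name runPattern is made up for this example; it constructs the same pattern string as the RegExp call in pat1.

```javascript
// Hypothetical helper: the pattern pat1 builds for a repetition count i,
// i.e. one character followed by i copies of itself (group length i + 1).
function runPattern(i) {
  return new RegExp('^((.)\\2{' + i + '})+$');
}

console.log(runPattern(1).source);             // ^((.)\2{1})+$  (groups of 2)
console.log(runPattern(4).source);             // ^((.)\2{4})+$  (groups of 5)
console.log(runPattern(4).test('xxxxxfffff')); // true
console.log(runPattern(1).test('xxxxxfffff')); // false (5 x's cannot be cut into pairs)
console.log(runPattern(1).test('aabbbbcc'));   // true
```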
If the length of the pattern is not required, the regular expression can also be set up in one go (without the need of the for loop around it):
function pat2(s){
var rx=/^((.)\2)+$|^((.)\4{2})+$|^((.)\6{4})+$|^((.)\8{6})+$/;
return !!rx.exec(s);
}
Here are the results from both tests:
console.log(strarr.map(str=>
str+': '+pat1(str)
+' '+pat2(str)).join('\n')+'\n');
aabbcc: 2 true
abbccaa: false false
xxxrrrruuu: false false
xxxxxfffff: 5 true
aa: 2 true
aabbbbcc: 2 true
negative: false false
The regex in pat2 looks for certain repetition counts only. When 1, 2, 4 or 6 repetitions of a previous character are found, the result is positive. The matched groups have lengths of 2, 3, 5 or 7 characters (prime numbers!). With these length checks, any group length divisible by one of these numbers will be found as positive (2,3,4,5,6,7,8,9,10,12,14,15,16,18,20,21,22,24,...).
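The prime-number idea can be generalised by generating the alternatives instead of typing them out. The following is only a sketch: buildPrimeRegex is a hypothetical name, and the caller is assumed to pass a list of primes.

```javascript
// Hypothetical generalisation of pat2: one alternative per prime group length.
function buildPrimeRegex(primes) {
  const alternatives = primes.map((p, k) => {
    const charGroup = 2 * k + 2; // number of the capture group holding the single character
    return '^((.)\\' + charGroup + '{' + (p - 1) + '})+$';
  });
  return new RegExp(alternatives.join('|'));
}

const rx7 = buildPrimeRegex([2, 3, 5, 7]);
const rx11 = buildPrimeRegex([2, 3, 5, 7, 11]);

console.log(rx7.test('aabbbbcc'));      // true
console.log(rx7.test('xxxrrrruuu'));    // false
console.log(rx7.test('a'.repeat(11)));  // false: 11 is prime and not in the list
console.log(rx11.test('a'.repeat(11))); // true
```

The third result shows the limit of the fixed pattern in pat2: a group length whose only prime factors are larger than 7 is not recognised unless that prime is added.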
Since regex has never been my forte, here's an approach that uses String#replace() to add a delimiter to the string wherever the letter changes, then uses that delimiter to split the string into an array and checks that all elements in the array have the same length:
const values = ['aabbcc', 'abbccaa', 'xxxrrrruuu', 'xxxxxfffff', 'aa'];
const expect = [true, false, false, true, true];
const hasMatchingGroups = (str) => {
  if (!str) return false;
  // insert a '|' delimiter wherever the next letter differs from the current one
  const groups = str.replace(/[a-z]/g, (match, offset, string) => {
    return string[offset + 1] && match !== string[offset + 1] ? match + '|' : match;
  }).split('|');
  // every group needs at least two characters, and all groups the same length
  // (this replaces the original odd-length check, which rejected e.g. 'aaabbbccc')
  return groups[0].length > 1 && groups.every(s => s.length === groups[0].length);
}
values.forEach((s, i) => console.log(JSON.stringify([s, hasMatchingGroups(s), expect[i]])))