Regex split on upper case and first digit
I need to split the string "thisIs12MyString"
into an array looking like [ "this", "Is", "12", "My", "String" ]
I've got as far as "thisIs12MyString".split(/(?=[A-Z0-9])/)
but it splits on each digit and gives the array [ "this", "Is", "1", "2", "My", "String" ]
So, in words: I need to split the string before each upper-case letter and before each digit that does not have another digit in front of it.
Are you looking for this?
"thisIs12MyString".match(/[A-Z]?[a-z]+|[0-9]+/g)
returns
["this", "Is", "12", "My", "String"]
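A minimal sketch of the same idea wrapped in a function, in case it helps to see what each alternative of the pattern matches. The name splitCamel is made up for illustration. Note that match keeps only what the pattern matches, so characters outside the pattern (punctuation, for example) are silently dropped, and it returns null when nothing matches at all.

```javascript
// Hypothetical helper; the name splitCamel is not from the original answer.
function splitCamel(input) {
  // [A-Z]?[a-z]+  : an optional capital followed by lowercase letters
  // [0-9]+        : a whole run of digits, kept as one token
  // match() returns null when nothing matches, so fall back to [].
  return input.match(/[A-Z]?[a-z]+|[0-9]+/g) || [];
}

console.log(splitCamel("thisIs12MyString"));
// [ 'this', 'Is', '12', 'My', 'String' ]
console.log(splitCamel(""));
// []
```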
As I said in my comment, my approach would be to first insert a special character before each sequence of digits, as a marker:
"thisIs12MyString".replace(/\d+/g, '~$&').split(/(?=[A-Z])|~/)
where ~ could be any other character, preferably a non-printable one (e.g. a control character), as it is unlikely to appear "naturally" in a string.
In that case, you could even insert the marker before each capital letter as well and omit the lookahead, making the split very easy:
"thisIs12MyString".replace(/\d+|[A-Z]/g, '~$&').split('~')
It might or might not perform better.
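To make the marker idea concrete, here is a sketch that uses a control character (U+0000, an arbitrary choice) as the marker instead of ~. The function name and the MARKER constant are hypothetical. One detail worth noting: if the string starts with a capital or a digit, the marker lands at index 0 and split yields a leading empty string, so the sketch filters empty pieces out.

```javascript
// Hypothetical helper illustrating the marker approach.
// MARKER is an assumption: any character that cannot occur in the input works.
const MARKER = "\u0000";

function splitWithMarker(input) {
  return input
    // Prefix every digit run and every capital letter with the marker ($& is the match).
    .replace(/\d+|[A-Z]/g, MARKER + "$&")
    .split(MARKER)
    // Drop the empty first element produced when the input starts with a match.
    .filter(function (part) { return part !== ""; });
}

console.log(splitWithMarker("thisIs12MyString"));
// [ 'this', 'Is', '12', 'My', 'String' ]
console.log(splitWithMarker("Version2Beta"));
// [ 'Version', '2', 'Beta' ]
```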
In my Rhino console:
js> "thisIs12MyString".replace(/([A-Z]|\d+)/g, function(x){return " "+x;}).split(/ /);
this,Is,12,My,String
Another one:
js> "thisIs12MyString".split(/(?:([A-Z]+[a-z]+))/g).filter(function(a){return a;});
this,Is,12,My,String
I can't think of any way to achieve this with a regex alone.
I think you will need to do it in code.
Please check the URL, same question different language (ruby) ->
The code is at the bottom: http://code.activestate.com/recipes/440698-split-string-on-capitalizeduppercase-char/
You can work around the missing lookbehind support in JS by post-processing the array that your current regex produces.
Quick pseudo code:
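var result = [];
var digitsFlag = false; // true while the previous piece was a single digit

// isSingleDigit was left undefined in the pseudo code; this is one possible definition.
function isSingleDigit(word) {
  return /^\d$/.test(word);
}

"thisIs12MyString".split(/(?=[A-Z0-9])/).forEach(function (word) {
  if (isSingleDigit(word)) {
    if (!digitsFlag) {
      result.push(word); // first digit of a run starts a new element
    } else {
      result[result.length - 1] += word; // later digits join the previous element
    }
    digitsFlag = true;
  } else {
    result.push(word);
    digitsFlag = false;
  }
});
// result is now ["this", "Is", "12", "My", "String"]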
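For comparison, engines that implement ES2018 lookbehind assertions can express the "digit not preceded by another digit" condition directly in the split pattern. This is only a sketch that assumes such an engine (a recent Node.js or a current browser, for example), and the function name is hypothetical.

```javascript
// Assumes an engine with ES2018 lookbehind support.
// Hypothetical helper name, chosen for illustration only.
function splitOnCapsAndDigitRuns(input) {
  // (?=[A-Z])      : split before every capital letter
  // (?<!\d)(?=\d)  : split before a digit only if the previous character is not a digit
  return input.split(/(?=[A-Z])|(?<!\d)(?=\d)/);
}

console.log(splitOnCapsAndDigitRuns("thisIs12MyString"));
// [ 'this', 'Is', '12', 'My', 'String' ]
```

Because both alternatives are zero-width, nothing is consumed by the split and every character of the input ends up in some element.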