
Split string including regular expression match

I am parsing some text with JavaScript. Let's say I have some string:

"hello world <1> this is some random text <3> foo <12>"

I need to place the following substrings in an array:

myArray[0] = "hello world ";
myArray[1] = "<1>";
myArray[2] = " this is some random text ";
myArray[3] = "<3>";
myArray[4] = " foo ";
myArray[5] = "<12>";

Note that I am splitting the string whenever I encounter a <"number"> sequence.

I have tried splitting the string with the regular expression /<\d{1,3}>/, but when I do so I lose the <"number"> sequences. In other words, I end up with "hello world ", " this is some random text ", " foo ". Note that I lose the strings "<1>", "<3>" and "<12>", which I would like to keep. How can I solve this?

You need to capture the sequence to retain it. When the separator regex contains a capturing group, split() includes the captured text in the resulting array:

var str = "hello world <1> this is some random text <3> foo <12>";

str.split(/(<\d{1,3}>)/);

// ["hello world ", "<1>", " this is some random text ", "<3>", " foo ", "<12>", ""]
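Note that the result above ends with an empty string, because the input ends with a match. A minimal sketch, assuming you want exactly the six elements listed in the question, is to filter out the empty pieces. The name `splitKeepingTags` is hypothetical and only used for illustration:

```javascript
// Hypothetical helper: split on <number> tags but keep the tags in the result.
function splitKeepingTags(str) {
  // The capturing group makes split() include each matched tag in the output.
  // filter() drops the empty strings produced when a tag sits at the very
  // start or end of the input, or when two tags are adjacent.
  return str.split(/(<\d{1,3}>)/).filter(function (part) {
    return part !== "";
  });
}

var parts = splitKeepingTags("hello world <1> this is some random text <3> foo <12>");
console.log(parts);
// ["hello world ", "<1>", " this is some random text ", "<3>", " foo ", "<12>"]
```

Filtering also removes empty strings between adjacent tags, so skip this step if you rely on the strict text/tag alternation of the raw split() result.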

In case some browsers have issues with the capturing group, you could do it manually like this:

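var str = "hello world <1> this is some random text <3> foo <12>",
    re = /<\d{1,3}>/g,   // the global flag lets exec() advance through the string
    result = [],
    match,
    last_idx = 0;

while ( ( match = re.exec( str ) ) !== null ) {
   // Push the text before this match, then the match itself.
   result.push( str.slice( last_idx, match.index ), match[0] );

   last_idx = re.lastIndex;
}
// Push whatever follows the last match (an empty string for this input).
result.push( str.slice( last_idx ) );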
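If you need the loop above in more than one place, it could be wrapped in a function. The following is only a sketch: `splitWithMatches` and its parameter names are hypothetical, and it assumes the pattern is passed as a RegExp object. Like split() with a capturing group, it returns a trailing empty string when the input ends with a match.

```javascript
// Hypothetical helper: split str on pattern, keeping each match in the result.
function splitWithMatches(str, pattern) {
  // Copy the pattern with the global flag so exec() advances through the string.
  var flags = pattern.flags.indexOf("g") === -1 ? pattern.flags + "g" : pattern.flags;
  var re = new RegExp(pattern.source, flags);
  var result = [];
  var lastIdx = 0;
  var match;

  while ((match = re.exec(str)) !== null) {
    // Guard against zero-length matches, which would otherwise loop forever.
    if (match[0] === "") {
      re.lastIndex++;
      continue;
    }
    // Text before the match, then the match itself.
    result.push(str.slice(lastIdx, match.index), match[0]);
    lastIdx = re.lastIndex;
  }
  // Whatever follows the last match (possibly an empty string).
  result.push(str.slice(lastIdx));
  return result;
}

var pieces = splitWithMatches("a<1>b<22>", /<\d{1,3}>/);
console.log(pieces);
// ["a", "<1>", "b", "<22>", ""]
```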
