regex split with non-capturing groups

Question

I want to match HTML tag names (e.g. the div in <div>) and then split the string at the position of the match.

    var str = '&lt;div&gt; div';
    var regex = /(?:&lt;)(\w*)(?=&gt;)?/g;
    var arr = str.split(regex);
    console.log(arr);
    // result:   ["", "div", "&gt; div"]
    // expected: ["&lt;", "&gt; div"]

However, the "&lt;" gets lost this way, and I also want the div between the < and > removed. How can I achieve this?

This one doesn't work either, because the "fake" div at the end of the string is also split on, even though it is not inside < and >:

    var str = '&lt;div&gt; div';
    var regex = /(?:&lt;)(\w*)(?=&gt;)?/g;
    var match = regex.exec(str);
    var arr = match.input.split(match[1]);
    console.log(arr);
    // result:   ["&lt;", "&gt; ", ""]
    // expected: ["&lt;", "&gt; div"]
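A side note on why the first attempt returns "div" and loses the "&lt;": String.prototype.split drops everything the pattern consumes, but splices the text of any capturing group into the result array. Making a group non-capturing only stops the splicing; the consumed text is still dropped. A minimal sketch of that behavior, using simplified patterns and illustrative variable names that are not part of the question:

```javascript
var str = '&lt;div&gt; div';

// Capturing group: the captured "div" is spliced into the result,
// while the consumed "&lt;" is dropped along with the rest of the match.
var withCapture = str.split(/&lt;(\w*)/);
console.log(withCapture); // ["", "div", "&gt; div"]

// Non-capturing group: nothing is spliced in, but the whole match
// ("&lt;div") is still removed from the output.
var withoutCapture = str.split(/&lt;(?:\w*)/);
console.log(withoutCapture); // ["", "&gt; div"]
```

To keep the "&lt;" in the output, the pattern must not consume it, which is why the delimiters are usually checked with zero-width assertions instead.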

Answer 1

If you want to use only a single regex, the closest you can get is probably:

var regex = /\b(?:\w+)(?=&gt;)/gi;
'&lt;div&gt; div'.split(regex);//["&lt;", "&gt; div"]

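It gives the expected result, but the obvious problem is that it does not check for the preceding <. JavaScript also did not natively support lookbehind at the time of writing (lookbehind assertions were only added in ES2018), so that check cannot be done in the same pattern on older engines.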
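Where ES2018 lookbehind assertions are available, the check for the preceding "&lt;" can be folded into the same pattern. The following is a minimal sketch under that assumption (engines without lookbehind support throw a SyntaxError on this pattern); the name tagNameRgx is illustrative only:

```javascript
// Requires an engine with ES2018 lookbehind support.
// (?<=&lt;) asserts that "&lt;" comes right before the tag name without consuming it;
// (?=&gt;)  asserts that "&gt;" comes right after it.
var tagNameRgx = /(?<=&lt;)\w+(?=&gt;)/g;

var single = '&lt;div&gt; div'.split(tagNameRgx);
console.log(single); // ["&lt;", "&gt; div"]

var multiple = '&lt;div&gt; &lt;img&gt; div'.split(tagNameRgx);
console.log(multiple); // ["&lt;", "&gt; &lt;", "&gt; div"]
```

Because both assertions are zero-width, only the tag name itself is removed. Note that with several tags this splits only at the tag names, so the text between a "&gt;" and the next "&lt;" stays in one piece.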

A better approach might be to split on the < side and the > side separately and then combine the results:

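var str = '&lt;div&gt; div';
// Splits just before "&lt": at a whitespace character (which is consumed),
// at a word boundary, or at the start of the string.
var ltRgx = /(?:\s|\b|^)(?=&lt)/gi;
// Matches a tag name that is immediately followed by "&gt;".
var gtRgx = /\b(?:\w+)(?=&gt;)/gi;
// Split into chunks before each "&lt", split each chunk on its tag name, then flatten.
var result = str.split(ltRgx).map(function (d) {
    return d.split(gtRgx);
}).reduce(function (ac, d) {
    return ac.concat(d);
});
console.log(result); // ["&lt;", "&gt; div"]

/* Another example */
str = '&lt;div&gt; &lt;img&gt; div';
result = str.split(ltRgx).map(function (d) {
    return d.split(gtRgx);
}).reduce(function (ac, d) {
    return ac.concat(d);
});
console.log(result); // ["&lt;", "&gt;", "&lt;", "&gt; div"]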
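Since the same split/flatten sequence is repeated for each input, it could be wrapped in a small function. This is only a sketch: splitOnTagNames is a hypothetical name, and the two patterns are the ones from the two-step approach above (with the redundant i flag and non-capturing group dropped).

```javascript
// Hypothetical helper wrapping the two-step split.
function splitOnTagNames(str) {
    var ltRgx = /(?:\s|\b|^)(?=&lt)/g; // split just before "&lt"
    var gtRgx = /\b\w+(?=&gt;)/g;      // tag name directly followed by "&gt;"
    // Split into chunks first, then split each chunk on its tag name
    // and collect all the pieces in one flat array.
    return str.split(ltRgx).reduce(function (acc, chunk) {
        return acc.concat(chunk.split(gtRgx));
    }, []);
}

console.log(splitOnTagNames('&lt;div&gt; div'));
// ["&lt;", "&gt; div"]
console.log(splitOnTagNames('&lt;div&gt; &lt;img&gt; div'));
// ["&lt;", "&gt;", "&lt;", "&gt; div"]
console.log(splitOnTagNames('no tags here'));
// ["no tags here"]
```

Passing an empty array as the initial value to reduce replaces the separate map step, and a string without any tags comes back as a single-element array.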

solution1
0 2018-04-19 23:08:33
