
How to properly exclude group in regex?

I need to match a pattern in some text, but the match must not contain another pattern. The text comes from an HTML page, where line breaks are not real newlines; a <br> tag is inserted instead, and that is what causes the trouble.

I tried this regex:

/\|([^\r\n|]+?(?!<br>))\|/igm

and the example input is:

test1 | test2 | test3<br>| test4<br>| test5 |<br>test6

The expected match is | test2 | , with test2 as the captured group, but right now the regex also matches | test4<br>| and not the correct | test5 | . I need to exclude the test4 match, but I don't know how to combine that with [] because the (?!<br>) lookahead seems to be ignored.

PS: of course | test2 | may also be | text1 <span ...>text2</span> text3 | , so adding <> to the [] character class is not a solution for me.

The regex you need should be based on a tempered greedy token:

/\|((?:(?!<br\s*\/?>)[^\r\n|])*)\|/gi
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^

See the regex demo

The token is (?:(?!<br\s*\/?>)[^\r\n|])* and it matches, zero or more times, any character other than CR, LF or | (the [^\r\n|] negated character class accounts for that) that does not start a <br> tag sequence (or <br > or <br/> or <br /> , etc.). The lookahead is checked before every single character, so the match stops as soon as a <br> tag begins. The contents matched by the token are captured into group #1 since the token is wrapped in capturing parentheses (...) .
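To make the difference in lookahead placement concrete, here is a minimal sketch that runs the pattern from the question and the tempered greedy token against the same sample input. The helper name groups is made up for this illustration and is not part of the original answer.

```javascript
// Hypothetical helper (not from the original answer):
// collect the Group 1 value of every match of `re` in `str`.
function groups(re, str) {
  // matchAll requires the g flag; each match is an array of groups.
  return Array.from(str.matchAll(re), function (m) { return m[1]; });
}

// Pattern from the question: the lookahead is tested once,
// at the position where the lazy run currently ends.
var original = /\|([^\r\n|]+?(?!<br>))\|/gi;

// Tempered greedy token: the lookahead is tested before every
// character that gets consumed.
var tempered = /\|((?:(?!<br\s*\/?>)[^\r\n|])*)\|/gi;

var sample = 'test1 | test2 | test3<br>| test4<br>| test5 |<br>test6';

console.log(groups(original, sample)); // [ ' test2 ', ' test4<br>' ]
console.log(groups(tempered, sample)); // [ ' test2 ', ' test5 ' ]
```

In the original pattern the lazy quantifier simply grows past the <br> until the lookahead passes again, which is why the test4 cell is still matched and the following pipe is consumed before test5 can be found.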

JS demo:

var re = /\|((?:(?!<br\s*\/?>)[^\r\n|])*)\|/ig;
var str = 'test1 | test2 | test3<br>| test4<br>| test5 |<br>test6|';
var res = [];
var m;
// exec() advances re.lastIndex on each call because of the g flag
while ((m = re.exec(str)) !== null) {
    res.push(m[1]); // Grab Group 1 value only
}
console.log(res); // [" test2 ", " test5 "]
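Regarding the PS in the question, the sketch below suggests that other inline tags such as <span> are still allowed inside a match, since only a <br> tag, a pipe, or a line break ends the token. The function name extractCells, the trimming step, and the sample string are assumptions added for illustration.

```javascript
// Hypothetical helper: return the trimmed contents of every |...| cell
// that does not contain a <br> tag, a pipe, or a line break.
function extractCells(text) {
  var re = /\|((?:(?!<br\s*\/?>)[^\r\n|])*)\|/gi;
  // matchAll works on a copy of the regex, so lastIndex is not shared.
  return Array.from(text.matchAll(re), function (m) {
    return m[1].trim(); // trimming is an extra step, not in the answer
  });
}

// Made-up input: a cell with a <span>, then a cell broken by <br>.
var input = '| text1 <span class="x">text2</span> text3 | next<br>| bad<br>| ok |';

console.log(extractCells(input));
// [ 'text1 <span class="x">text2</span> text3', 'ok' ]
```

The cell containing bad is skipped because a <br> appears before its closing pipe, while the <span> markup passes through untouched.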
