Regex select text outside two strings and .replace on the selection

I have a string that looks like the following:

"This is a test [Text that (cannot) be changed]. But (this) can be changed."

I want to replace the strings inside ( and ) with HTML, but not when they are inside [ ]. I also want to replace all text within [ ] with different HTML. My final result would look like the following:

"This is a test <p>Text that (cannot) be changed</p>. But <b>this</b> can be changed."

I created an expression that selects everything outside the [ ] strings. But how can I perform the replace on this selected text only? To select everything outside [ ] I use this:

([^\[\]]+)(?:\s|$|\[)

This selects all text outside [ and ] . I want to perform regex replace for ( ) on this selected text only.

You might combine a regex and a callback function to replace the stuff you want:

var subject = 'This (is) a test [Then some text that (cannot) (be) changed]. But (this) (can) be changed.';
var regex = /(?:^|])([^\[]*)(?:\n|$|\[)/g;

var replace = subject.replace(regex, function(match, p1)
{
    return match.replace(/\(/g, '<b>').replace(/\)/g, '</b>');
});

console.log(replace);
// This <b>is</b> a test [Then some text that (cannot) (be) changed]. But <b>this</b> <b>can</b> be changed.

Demo: http://jsfiddle.net/q21sns3s/2/

Regex explanation:

(?:^|]) : we need the beginning of the subject or a closing ]

([^\[]*) : followed by anything but an opening [

(?:\n|$|\[) : ended by an opening [ , a new line or the end of the subject ( $ )
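
To see what the pattern explained above actually selects, here is a minimal sketch that lists capture group 1 for each match. The helper name outsideSegments and the short sample string are made up for illustration, and String.prototype.matchAll assumes an ES2020-capable runtime.

```javascript
// Hypothetical helper: returns what capture group 1 holds for each match,
// i.e. the text segments that lie outside [ ].
function outsideSegments(subject) {
  var regex = /(?:^|])([^\[]*)(?:\n|$|\[)/g;
  return Array.from(subject.matchAll(regex), function (m) {
    return m[1];
  });
}

var segments = outsideSegments('a (b) [c (d)] e (f)');
console.log(segments);
// [ 'a (b) ', ' e (f)' ]
```

The delimiting ] and [ belong to the full match but not to group 1, and the bracketed text itself is never part of a match, which is why the callback can safely rewrite parentheses in the whole match.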

The best approach here is explained in this SO answer, which uses a don't catch this|(do catch this) technique. My regex is this:

\[[^\]]*]|\(([^)]*)\)

So I match everything between [] as well as everything between (), but only the latter generates a capture group containing the text I want to wrap. I can then examine this capture group to decide what to do: return the match unchanged or put <b></b> around the captured text.

var subject = 'This (is) a test [Then some text that (cannot) (be) changed]. But (this) can (be) changed.';
var regex = /\[[^\]]*]|\(([^)]*)\)/g;

var replace = subject.replace(regex, function(match, p1)
{
    // p1 is undefined when the [ ... ] alternative matched: leave it untouched
    return (p1 === undefined) ? match : '<b>' + p1 + '</b>';
});

console.log(replace);
// This <b>is</b> a test [Then some text that (cannot) (be) changed]. But <b>this</b> can <b>be</b> changed.

(credit to @johansatge for the nice template, I just changed the regex and the return line)
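
The question also asks for the text inside [ ] to be wrapped in different HTML. As a rough sketch (not part of the original answer), the same alternation can carry a second capture group so that each alternative gets its own replacement. The function name convertMarkup is hypothetical, and the sketch assumes brackets are balanced and not nested.

```javascript
// Hypothetical helper, assuming balanced, non-nested [ ] and ( ).
function convertMarkup(text) {
  // Alternative 1 captures the text inside [ ] (group 1).
  // Alternative 2 captures the text inside ( ) (group 2).
  // A whole [ ... ] run is consumed as one match, so any ( ) inside it
  // is never tried against the second alternative.
  var regex = /\[([^\]]*)\]|\(([^)]*)\)/g;
  return text.replace(regex, function (match, bracketed, parenthesized) {
    if (bracketed !== undefined) {
      return '<p>' + bracketed + '</p>';
    }
    return '<b>' + parenthesized + '</b>';
  });
}

var converted = convertMarkup('This is a test [Text that (cannot) be changed]. But (this) can be changed.');
console.log(converted);
// This is a test <p>Text that (cannot) be changed</p>. But <b>this</b> can be changed.
```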

Using /[(][a-z]+[)]/g on the text you have extracted will allow you to replace the text "(this)"

var newText = myExtractedText.replace(/[(][a-z]+[)]/g, "(new text)"); 

EDIT:

To replace the text in the initial string (without extracting the stuff inside the '[]' first), you can do:

var s = "This is a test [Text that (cannot) be changed]. But (this) can be changed.",
    // grab the trailing run of letters, spaces and dots that contains a (...) group
    match = s.match(/[a-z ]+([(][a-z]+[)])[a-z .]+$/ig)[0];

console.log(match.replace(/[(][a-z]+[)]/, '(new text)'));
// only the matched tail is printed: " But (new text) can be changed."

You could do something like this to capture only the (..) that are not inside [], but JavaScript lacked the lookbehind feature when this was written (it was added in ES2018).

(?!\[)\(.*?\)(?<!\])

You could mimic this feature as described here. However, I think @funkwurm's answer is much cleaner. It's the best way to go for a problem like this.
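
One common way to get a similar effect with lookahead only (not necessarily the approach the linked post describes) is to reject a (..) group when the next bracket character after it is a closing ]. This is a minimal sketch that assumes brackets are balanced and never nested; the function name boldOutsideBrackets is made up for illustration.

```javascript
// Hypothetical helper, assuming balanced, non-nested [ ].
function boldOutsideBrackets(text) {
  // After the closing ")", the negative lookahead fails the match if we can
  // reach a "]" without first crossing another bracket character, which
  // means the group sits inside an open [ ... ].
  var regex = /\(([^)]*)\)(?![^\[\]]*\])/g;
  return text.replace(regex, '<b>$1</b>');
}

var bolded = boldOutsideBrackets('This (is) a test [Then some text that (cannot) (be) changed]. But (this) (can) be changed.');
console.log(bolded);
// This <b>is</b> a test [Then some text that (cannot) (be) changed]. But <b>this</b> <b>can</b> be changed.
```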
