
Regular expression to match special characters between delimiters

I have a basic string and would like to get only specific characters between the brackets.

Base string: This is a test string [more or less]

Regex: capturing all r's and e's works just fine.

(r|e) 

=> This is a t e st st r ing [mo re o r l e ss]

Now I want to combine the following regex with mine so that it matches only the r's and e's between the brackets, but unfortunately this doesn't work:

\[(r|e)\]

Expected result should be : mo re o r l e ss

Can someone explain?

Edit: the problem is very similar to this one: Regular Expression to find a string included between two characters while EXCLUDING the delimiters

The difference is that I don't want to get the whole string between the brackets.

Follow up problem

base string = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed'

I need a regex that finds the non-ASCII characters äöü so I can replace them, but only in the link:...] substring, which starts with the word link: and ends with a ] char.

The result string will look like this:

result string = 'this is a link:/en/test/apfel/ohr[MyLink_with_aou] BREAK äöü is now allowed again'

The regex /[äöü]+(?=[^\]\[]*])/g from the solution in the comments only delivers the äöü chars between the two brackets.

I know that there is a forward lookahead with a char list in the regex, but I wonder why this one does not work:

/link:([äöü]+(?=[^\]\[]*])/

thanks

You can use the following solution: match everything between link: and ], and replace your characters only inside the matched substrings, using a replace callback:

    var hashmap = {"ä":"a", "ö":"o", "ü":"u"};
    var s = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed';
    var res = s.replace(/\blink:[^\]]*/g, function(m) {
        // m = link:/en/test/äpfel/öhr[MyLink_with_äöü (the match stops just before the closing ])
        return m.replace(/[äöü]/g, function(n) {
            // n = ä, then ö, then ü
            return hashmap[n]; // each one is replaced with its hashmap value
        });
    });
    console.log(res);

Pattern details:

  • \b - a leading word boundary
  • link: - the whole word link with a : after it
  • [^\]]* - zero or more chars other than ] (a [^...] is a negated character class that matches any char/char range(s) but the ones defined inside it).
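The same two-step idea also covers the first part of the question. \[(r|e)\] does not work there because it only matches a [ followed by exactly one r or e and then a ], i.e. the literal strings [r] or [e]. A minimal sketch, assuming the brackets are not nested (the function name matchInsideBrackets is made up for this illustration):

```javascript
// Hypothetical helper: return every match of innerRegex found inside [...] pairs.
// Assumes brackets are not nested and innerRegex carries the g flag.
function matchInsideBrackets(s, innerRegex) {
  var found = [];
  var outer = /\[([^\]]*)\]/g; // "[", then any run of non-"]" chars (group 1), then "]"
  var m;
  while ((m = outer.exec(s)) !== null) {
    var hits = m[1].match(innerRegex); // search only the bracketed part
    if (hits) found = found.concat(hits);
  }
  return found;
}

var base = 'This is a test string [more or less]';
var insideBrackets = matchInsideBrackets(base, /[re]/g);
console.log(insideBrackets.join(' ')); // r e r e
```

The r's and e's outside the brackets (in "test" and "string") are skipped because the inner regex never sees that part of the string.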

Also, see Efficiently replace all accented characters in a string?
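If more accented letters than ä, ö and ü need handling, one possible alternative to the hashmap is Unicode normalization. This is not part of the solution above, only a sketch that assumes String.prototype.normalize (ES2015) is available; the function name stripAccentsInLinks is made up for this illustration:

```javascript
// Hypothetical helper: strip diacritics, but only inside link:...] substrings.
function stripAccentsInLinks(s) {
  return s.replace(/\blink:[^\]]*/g, function (m) {
    // NFD splits "ä" into "a" + U+0308 (combining diaeresis);
    // the inner replace then drops combining marks in U+0300..U+036F.
    return m.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
  });
}

var input = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed';
console.log(stripAccentsInLinks(input));
```

Letters that have no decomposition, such as ß or ø, are left unchanged by this approach, so a lookup table is still needed for them.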
