
Regex - Getting letters preceded and followed by a specific character

I need to match every single letter, including the letters directly before and after an underscore "_". However, I don't want to match words like "pi", "\Delta" and "\Sigma".

How can I do this with a JavaScript regex?

/\b([^e|_|\d|\W])\b/gim /*my regex*/

(1)/(2)+p_a*r*e*t*a*v+pi+\delta+\sigma

(1)/(2)+a_t*e*j*h*o+ \Delta

(1)/(2)+p_w

To match all the letters a-z except e, you could use a capturing group and a (negated) character class:

[_\W]([a-df-z])(?![^_\W])
  • [_\W] Match an _ or match a non word char
  • ( Capture group 1
    • [a-df-z] Match a lowercase a-z except e
  • ) Close group
  • (?! Negative lookahead, assert what is on the right is not
    • [^_\W] Match any char except _ or a non word char
  • ) Close lookahead


const regex = /[_\W]([a-df-z])(?![^_\W])/g;
let str = `(1)/(2)+p_a*r*e*t*a*v+pi+\\delta+\\sigma
(1)/(2)+a_t*e*j*h*o+ \\Delta
(1)/(2)+p_w
`;
let m;

while ((m = regex.exec(str)) !== null) {
  // This is necessary to avoid infinite loops with zero-width matches
  if (m.index === regex.lastIndex) {
    regex.lastIndex++;
  }
  // Group 1 holds the single letter; m[0] also contains the character before it
  console.log(m[1]);
}
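The same pattern can also be wrapped in a small function that returns the captures as an array, which may be easier to reuse than logging inside a loop. This is only a sketch: the name `singleLetters` is hypothetical and not part of the original answer, and it relies on `String.prototype.matchAll`.

```javascript
// Hypothetical helper (name is an assumption): returns every group-1 capture.
function singleLetters(input) {
  const regex = /[_\W]([a-df-z])(?![^_\W])/g;
  // matchAll needs the g flag; each match m carries the letter in m[1].
  return Array.from(input.matchAll(regex), (m) => m[1]);
}

const sample = "(1)/(2)+p_a*r*e*t*a*v+pi+\\delta+\\sigma";
console.log(singleLetters(sample)); // [ 'p', 'a', 'r', 't', 'a', 'v' ]
```

Because the pattern consumes one `_` or non-word character before the letter, a letter at the very start of the input is skipped: `singleLetters("a_t")` returns only `["t"]`.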

One way is to use alternation: match the undesired tokens first so that they are consumed, then capture the desired letters, with an expression similar to

\\sigma|\\delta|pi|[\W0-9_]|([\w])

Those desired letters are in capturing group 1:

([\w])

const regex = /\\sigma|\\delta|pi|[\W0-9_]|([\w])/gmi;
const str = `(1)/(2)+p_a*r*e*t*a*v+pi+\\delta+\\sigma
(1)/(2)+a_t*e*j*h*o+ \\Delta
(1)/(2)+p_w`;
let m;

while ((m = regex.exec(str)) !== null) {
  // This is necessary to avoid infinite loops with zero-width matches
  if (m.index === regex.lastIndex) {
    regex.lastIndex++;
  }

  // The result can be accessed through the `m`-variable.
  // Group 1 is undefined whenever one of the undesired alternatives matched.
  m.forEach((match, groupIndex) => {
    console.log(`Found match: group ${groupIndex}: ${match}`);
  });
}
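Since the undesired alternatives match without capturing anything, only the matches where group 1 is defined are of interest. A minimal sketch of that filtering step follows; the function name `desiredLetters` is an assumption made for illustration.

```javascript
// Hypothetical helper (name is an assumption): keeps only group-1 captures.
function desiredLetters(input) {
  const regex = /\\sigma|\\delta|pi|[\W0-9_]|([\w])/gi;
  const letters = [];
  for (const m of input.matchAll(regex)) {
    // m[1] is undefined when \sigma, \delta, pi, a digit, "_" or a symbol matched.
    if (m[1] !== undefined) {
      letters.push(m[1]);
    }
  }
  return letters;
}

console.log(desiredLetters("(1)/(2)+a_t*e*j*h*o+ \\Delta"));
// [ 'a', 't', 'e', 'j', 'h', 'o' ]
```

Unlike the character class `[a-df-z]` in the first answer, this expression does not exclude the letter e, so it shows up in the result.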


If you wish to simplify, modify or explore the expression, it is explained in the top right panel of regex101.com. If you'd like, you can also watch in this link how it would match against some sample inputs.


RegEx Circuit

jex.im visualizes regular expressions:

[image: diagram of the expression above]


Method 2

Or we could just work out a custom expression based on the patterns in the input:

[w]|[ate](?=\*)|\b[pa](?=[^a-z])|\b[^(e|_)\d\W]\b

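The problem is related to word boundaries (\b) and underscores. Technically, the underscore is part of the word construct \w, so there is no word boundary between a letter and an adjacent underscore, and \b cannot isolate the p in p_a.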
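The effect of the underscore on \b can be checked directly. The snippet below is a small illustration with made-up variable names, not part of the original answer.

```javascript
// \b only fires between a word character (\w) and a non-word character.
const standaloneP = /\bp\b/;

// "_" belongs to \w, so there is no boundary between "p" and "_".
const beforeUnderscore = standaloneP.test("p_a");

// "*" is a non-word character, so "p" is delimited on both sides.
const beforeAsterisk = standaloneP.test("p*a");

// For the same reason, p_a is treated as one three-character word.
const words = "p_a*r".match(/\w+/g);

console.log(beforeUnderscore, beforeAsterisk, words);
// false true [ 'p_a', 'r' ]
```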



 