
Extract words using regex

I have a query string and I want to match and extract specific words. When a term has the form word.EXACT(...), I want to extract the content inside the parentheses.

For example, from MESH.EXACT("blood glucose monitoring") I want to extract "blood glucose monitoring".

  • words.exact("N-Words") -> result "N-Words"
  • words.exact(N-Words) -> result N-Words

Query_Input= (EMB.EXACT("insulin treatment")) and (MESH.EXACT("blood glucose monitoring")) OR "Self-Monitoring of Blood Glucose”

The output needs to look like this:

Query_out= "insulin treatment" "blood glucose monitoring" "Self-Monitoring of Blood Glucose”

This demo has my input and my regex: https://regex101.com/r/rqpmXr/15

You could do:

(?<=\w\.EXACT\()[^)]+

See the regex demo. This matches one or more characters that are not a closing parenthesis, [^)]+, only when they are preceded by \w\.EXACT\( .
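As a minimal sketch of how the lookbehind pattern could be used from JavaScript (assuming an engine with lookbehind support, such as a recent Node.js or Chrome; the variable names are illustrative and the input uses straight quotes):

```javascript
// Sample input based on the question (straight quotes assumed).
const query = '(EMB.EXACT("insulin treatment")) and (MESH.EXACT("blood glucose monitoring"))';

// Lookbehind: match only text that is preceded by "<word char>.EXACT(".
const lookbehindRe = /(?<=\w\.EXACT\()[^)]+/g;

// With the g flag, String.prototype.match returns every match, or null if none.
const lookbehindMatches = query.match(lookbehindRe) || [];

console.log(lookbehindMatches.join(" "));
// prints: "insulin treatment" "blood glucose monitoring"
```

Because the lookbehind consumes no characters, each match is already the wanted text and no capture group is needed.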

If you want a substitution, you could capture the above match and use \1 (with a trailing space) as the replacement:

.*(?<=\w\.EXACT\()([^)]+).*\n|.*

as shown here: https://regex101.com/r/BS3nwr/4

Edit: As was brought to my attention in one of the comments, lookbehinds, (?<=...), are not supported in some web browsers, so you could use the following instead (note that this regex is slower, i.e. it requires more steps, than the previous one):

\w+\.EXACT\(([^)]+).*\n|.*?

You may use

/\w+\.EXACT\(([^)]*)\)/g

and replace with $1, the placeholder holding the Group 1 value. See the regex demo.

Pattern details

  • \w+ - 1 or more word chars
  • \.EXACT\( - a literal .EXACT( substring
  • ([^)]*) - Group 1: any 0+ chars other than ) (you may use [^()]* in case you need to make sure you are staying within 1 set of (...) )
  • \) - a ) char.

See the JS demo:

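 var s = 'MESH.EXACT("blood glucose monitoring") words tt.EXACT("blood glucose monitoring") ';
 // Group 1 captures everything between .EXACT( and the next ")".
 var rx = /\w+\.EXACT\(([^)]*)\)/g;
 // Replace each whole match with the text captured in Group 1.
 document.querySelector("#result").innerHTML = s.replace(rx, "$1");
 <div id="result"></div>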
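Note that replace keeps everything outside the matches, so on the question's full input the outer parentheses and the and/OR operators stay in place. If only the captured values are wanted, a sketch along these lines could collect Group 1 directly (assuming an environment with String.prototype.matchAll, and straight quotes in the input; the variable names are illustrative):

```javascript
// Input from the question, with straight quotes assumed.
const text = '(EMB.EXACT("insulin treatment")) and (MESH.EXACT("blood glucose monitoring")) OR "Self-Monitoring of Blood Glucose"';

const exactRe = /\w+\.EXACT\(([^)]*)\)/g;

// replace() swaps each match for Group 1 but leaves the rest of the string.
const replaced = text.replace(exactRe, "$1");

// matchAll() yields one match object per hit; m[1] is Group 1.
const extracted = Array.from(text.matchAll(exactRe), (m) => m[1]);

console.log(replaced);
// prints: ("insulin treatment") and ("blood glucose monitoring") OR "Self-Monitoring of Blood Glucose"
console.log(extracted.join(" "));
// prints: "insulin treatment" "blood glucose monitoring"
```

The bare quoted phrase after OR is not preceded by .EXACT(, so this pattern alone does not pick it up.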

I believe this would work:

.EXACT\((.*?)\)

Working Example : https://regex101.com/r/uQj2vv/2/
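To illustrate why the quantifier here is lazy, the following sketch compares .*? with a greedy .* on a line containing two .EXACT(...) terms. The sample string and variable names are made up for illustration, and the leading dot is escaped in this sketch so that it matches only a literal period:

```javascript
// Hypothetical sample with two EXACT terms on one line.
const sample = 'MESH.EXACT("blood glucose monitoring") and EMB.EXACT("insulin treatment")';

// Lazy: (.*?) stops at the first ")" after each ".EXACT(".
const lazyRe = /\.EXACT\((.*?)\)/g;
// Greedy: (.*) runs to the last ")" on the line.
const greedyRe = /\.EXACT\((.*)\)/g;

const lazyGroups = Array.from(sample.matchAll(lazyRe), (m) => m[1]);
const greedyGroups = Array.from(sample.matchAll(greedyRe), (m) => m[1]);

console.log(lazyGroups);   // two separate values, one per term
console.log(greedyGroups); // a single value spanning both terms
```

With the lazy form each term yields its own capture, whereas the greedy form swallows the text between the two terms.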

Here is an executable Javascript example which extracts the output you specified from the input you specified:

 let input = "(EMB.EXACT(\"insulin treatment\")) and (MESH.EXACT(\"blood glucose monitoring\")) OR \"Self-Monitoring of Blood Glucose\"";
 // Group 1: a quoted phrase inside EXACT(...); Group 2: a quoted phrase after OR.
 let re = /(?:EXACT\(("[^"]+")\)|OR\s*("[^"]+"))/g;
 let Query_out = [];
 let match;
 while ((match = re.exec(input)) !== null) {
   // Exactly one of the two groups is set for each match.
   Query_out.push(match[1] ? match[1] : match[2]);
 }
 console.log(Query_out.join(" "));

