
Regular expression to make non-greedy

I have text like this:

EXPRESS      blood| muscle| testis| normal| tumor| fetus| adult
RESTR_EXPR   soft tissue/muscle tissue tumor

I want to extract only the last item in the EXPRESS line, which is adult.

My pattern is:

[|](.*?)\n

The match still comes out greedy: it captures muscle| testis| normal| tumor| fetus| adult. Is there any way to solve this?

Match a pipe char followed by optional spaces, then capture only chars that are not pipes. The value you want is in capture group 1.

If there has to be a newline at the end of the string:

\|[^\S\n]*([^|\n]*)\n

Explanation

  • \| Match |
  • [^\S\n]* Match optional whitespace chars without newlines
  • ( Capture group 1
    • [^|\n]* Match optional chars except for | or a newline
  • ) Close group 1
  • \n Match a newline
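
As a minimal sketch of the pattern above in use, assuming Python's re module and the sample text from the question (the variable names are illustrative only). It also runs the pattern from the question for comparison: a lazy quantifier only shortens the right-hand end of a match, while the match itself still starts at the leftmost pipe.

```python
import re

# Sample text from the question; the two lines are separated by "\n".
text = (
    "EXPRESS      blood| muscle| testis| normal| tumor| fetus| adult\n"
    "RESTR_EXPR   soft tissue/muscle tissue tumor\n"
)

# Pattern from the question: the search starts at the first pipe, and the
# lazy group then has to grow until it reaches the newline.
question_match = re.search(r"[|](.*?)\n", text)
question_item = question_match.group(1) if question_match else None

# Pattern from this answer: the group may not contain a pipe or a newline,
# so only the last pipe on the line can be followed directly by "\n".
answer_match = re.search(r"\|[^\S\n]*([^|\n]*)\n", text)
last_item = answer_match.group(1) if answer_match else None

print(repr(question_item))
print(repr(last_item))
```

Because the character class excludes the pipe, every attempt that starts at an earlier pipe fails before it reaches the newline, and the regex engine moves on to the next pipe.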


Or asserting the end of the string:

\|[^\S\n]*([^|\n]*)$
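
A small sketch of the $ variant, again assuming Python's re module. One assumption worth stating: in Python, $ without a flag matches only at the very end of the string (or just before a final newline), so when the EXPRESS line is followed by other lines the re.MULTILINE flag is needed for $ to match at the end of each line. Other regex engines may name or default this option differently.

```python
import re

# The EXPRESS line is not the last line here, and there is no trailing newline.
text = (
    "EXPRESS      blood| muscle| testis| normal| tumor| fetus| adult\n"
    "RESTR_EXPR   soft tissue/muscle tissue tumor"
)

pattern_text = r"\|[^\S\n]*([^|\n]*)$"

# Without MULTILINE, "$" can only match at the end of the whole string,
# and the last line contains no pipe, so nothing is found.
without_flag = re.search(pattern_text, text)

# With MULTILINE, "$" also matches right before each newline.
with_flag = re.search(pattern_text, text, re.MULTILINE)
end_item = with_flag.group(1) if with_flag else None

print(without_flag)
print(repr(end_item))
```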

You could use this one. It leaves the leading space out of the capture, handles the \r\n case, and is non-greedy:

\|\s*([^|]*?)\r?\n
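
A short sketch of this approach, assuming Python's re module and the question's sample text with both Windows-style and Unix-style line endings. The lazy quantifier is placed inside the capture group here so that group 1 holds the whole item; with the quantifier outside the parentheses, a repeated group keeps only its last iteration, which is a single character.

```python
import re

line_1 = "EXPRESS      blood| muscle| testis| normal| tumor| fetus| adult"
line_2 = "RESTR_EXPR   soft tissue/muscle tissue tumor"

crlf_text = line_1 + "\r\n" + line_2 + "\r\n"  # Windows-style line endings
lf_text = line_1 + "\n" + line_2 + "\n"        # Unix-style line endings

# Quantifier inside the group: the group captures the whole last item.
pattern = re.compile(r"\|\s*([^|]*?)\r?\n")

crlf_match = pattern.search(crlf_text)
lf_match = pattern.search(lf_text)
crlf_item = crlf_match.group(1) if crlf_match else None
lf_item = lf_match.group(1) if lf_match else None

# Quantifier outside the group: the group is overwritten on every
# repetition, so only the last matched character remains.
outside_match = re.search(r"\|\s*([^\|])*?\r?\n", crlf_text)
outside_item = outside_match.group(1) if outside_match else None

print(repr(crlf_item), repr(lf_item), repr(outside_item))
```

Because the group is lazy, it stops as soon as \r?\n can match, so the carriage return is not included in the captured value.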

