
Regex: how to exclude a substring from the result after it matches the pattern

I wonder how I can exclude a substring from the result after it matches the pattern. Example:

<a href="?page1"><?php __('string1');?></a>
<a href="?page2"><?php __("string2");?></a>

I want to get only the strings passed as parameters to the __() function. I tried this regex:

'/__\(((\'([^\']+)\')|(\"([^\"]+)\"))/'

but that returns 'string1' and "string2", still wrapped in their single and double quotes.
How can I exclude the single and double quotes?

  • Use (?: ) where appropriate. It groups without capturing.
  • If the quotes are inside the capturing ( ), they are included in the capture. If you put the quotes outside, they are not.
  • You have more ( ) than you need. | has the lowest precedence of the regex operators, so the alternatives need no parentheses of their own.
  • You are escaping more than you need. Quotes don't need to be escaped in the regex itself; only the quote character that delimits the surrounding string literal does.
  • Since you are using [^'] and [^"], you don't have to match the closing quote or parenthesis; the capture stops at the closing quote anyway.
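The point about quotes inside versus outside the parentheses can be checked by listing every group the question's pattern captures. This is a minimal sketch using Python's `re` module, which is an assumption: the thread's patterns are written with PHP-style `/.../` delimiters, but the syntax of this pattern is the same in both engines. The delimiters and the string-literal escapes are dropped.

```python
import re

# Pattern from the question, without the /.../ delimiters.
ORIGINAL = re.compile(r"""__\((('([^']+)')|("([^"]+)"))""")

line = """<a href="?page1"><?php __('string1');?></a>"""

match = ORIGINAL.search(line)
groups = match.groups()

# Groups are numbered by the position of their opening parenthesis.
for number, text in enumerate(groups, start=1):
    print(number, repr(text))
```

Groups 1 and 2 enclose the quote characters, so they return the quoted text; group 3 sits between the quotes and holds the bare string. Groups 4 and 5 belong to the double-quote alternative and stay empty for this line.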

A fix would be like:

'/__\((?:\'([^\']+)|"([^"]+))/'
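With this fix the bare string lands in group 1 for single-quoted arguments and in group 2 for double-quoted ones, so the caller has to pick whichever group took part in the match. A small sketch under the assumption that Python's `re` module is used (the helper name `extract_strings` is hypothetical):

```python
import re

# Quotes sit outside the capturing parentheses; (?: ) only groups the alternation.
FIXED = re.compile(r"""__\((?:'([^']+)|"([^"]+))""")

source = """<a href="?page1"><?php __('string1');?></a>
<a href="?page2"><?php __("string2");?></a>"""


def extract_strings(text):
    """Return the argument of each __() call without its quotes."""
    results = []
    for match in FIXED.finditer(text):
        # Exactly one of the two groups participates in each match.
        results.append(match.group(1) or match.group(2))
    return results


print(extract_strings(source))
```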

You can use lookahead and lookbehind, or group only the string inside the quotes.
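The lookaround variant could look like the sketch below. It assumes Python's `re` module, where a lookbehind must have a fixed width, which `__\(['"]` satisfies. The pattern is an illustration, not one given in the thread. It does not check that the opening and closing quotes are the same character, and it will not match a string that itself contains a quote.

```python
import re

# The lookbehind and lookahead assert the surrounding text without consuming it,
# so the whole match is the bare string and no capturing group is needed.
LOOKAROUND = re.compile(r"""(?<=__\(['"])[^'"]+(?=['"]\))""")

source = """<a href="?page1"><?php __('string1');?></a>
<a href="?page2"><?php __("string2");?></a>"""

strings = LOOKAROUND.findall(source)
print(strings)
```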

Try this

'/__\((\'|")((?:(?!\1).)+)\1\)/'
       ^^1^  ^^^^^2^^^^^^

Every time you open a round bracket you create a capturing group. If you don't want that, use (?: ), which defines a non-capturing group. I don't use one here. I rewrote your regex a bit: the first group matches either ' or " and stores it as group 1. Later I use the backreference \1 to that group to require the same quote character at the end. A backreference has no special meaning inside a character class, so the string itself is matched with (?:(?!\1).)+, that is, any run of characters that are not the opening quote.

Your result is then always stored in group 2. How you access it depends on the language you use.
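As one example of reading group 2, here is a sketch in Python's `re` module (an assumption; the thread does not name the language, and the helper name `extract_strings` is hypothetical). Because a backreference such as `\1` is not interpreted as one inside a character class, the sketch excludes the opening quote with a negative lookahead, `(?!\1)`, before each character.

```python
import re

# Group 1: the opening quote. Group 2: every character up to the same quote.
QUOTED_ARG = re.compile(r"""__\((['"])((?:(?!\1).)+)\1\)""")


def extract_strings(text):
    """Return the contents of group 2 for every __() call found in text."""
    return [match.group(2) for match in QUOTED_ARG.finditer(text)]


sample = """<?php __('string1'); __("it's"); ?>"""
print(extract_strings(sample))
```

Tying the closing quote to the opening one lets a double-quoted argument contain an apostrophe, as in the second call of the sample.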

You'll want to try a non-capturing group: (?:ABC)
