Getting text between quotes using regular expression

I'm having some issues with a regular expression I'm creating.

I need a regex to match against the following examples and then sub match on the first quoted string:

Input strings

("Lorem ipsum dolor sit amet, consectetur adipiscing elit.")

('Lorem ipsum dolor sit amet, consectetur adipiscing elit. ')

('Lorem ipsum dolor sit amet, consectetur adipiscing elit. ', 'arg1', "arg2")

Must sub match

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Regex so far:

\((["'])([^"']+)\1,?.*\)

The regex does a sub match on the text between the first set of quotes and returns the sub match displayed above.

This is almost working perfectly. The problem is that if the quoted string itself contains quotes, the sub match stops at the first one, as shown below:

Failing input strings

("Lorem ipsum dolor \"sit\" amet, consectetur adipiscing elit.")

Only sub matches: Lorem ipsum dolor

("Lorem ipsum dolor 'sit' amet, consectetur adipiscing elit.")

The entire match fails.
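
To make the two failures easy to reproduce, here is a minimal Python sketch that runs the pattern from above against shortened versions of the inputs. The helper name first_argument is hypothetical, and Python is only an assumption here since the question does not say which language the scanning script is written in.

```python
import re

# The pattern from the question, unchanged: group 1 is the opening quote,
# group 2 is the text up to the next quote character of either kind.
PATTERN = re.compile(r"""\((["'])([^"']+)\1,?.*\)""")


def first_argument(call):
    """Return group 2 of the match, or None when the pattern does not match."""
    match = PATTERN.search(call)
    return match.group(2) if match else None


# Plain string: the whole text is captured.
print(first_argument('("Lorem ipsum dolor sit amet.")'))

# Escaped double quotes: the capture is cut short at the first inner quote.
print(first_argument(r'("Lorem ipsum dolor \"sit\" amet.")'))

# Single quotes inside double quotes: no match at all, so None is printed.
print(first_argument('''("Lorem ipsum dolor 'sit' amet.")'''))
```

In this sketch the truncated capture also keeps the backslash that precedes the inner quote, because [^"']+ does not treat the backslash specially.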

Notes

The input strings are actually PHP function calls. I'm writing a script that will scan .php source files for a specific function and grab the text of its first parameter.

Try this regular expression:

\(\s*(?:"(?:[^"\\]+|\\.)*"|'(?:[^'\\]+|\\.)*')(?:\s*,\s*(?:"(?:[^"\\]+|\\.)*"|'(?:[^'\\]+|\\.)*'))*\s*\)

Some explanation:

  • \(\s* matches the opening parenthesis and optional whitespace.
  • (?:"(?:[^"\\]+|\\.)*"|'(?:[^'\\]+|\\.)*') matches any quoted string, allowing the quote character inside it only when it is escaped with \ .
  • (?:\s*,\s*(?:"(?:[^"\\]+|\\.)*"|'(?:[^'\\]+|\\.)*'))* describes zero or more further quoted strings, each preceded by a comma that may have whitespace before and after it.
  • \s*\) matches optional whitespace and the closing parenthesis.
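
The expression above matches the whole call but does not capture anything, so to get the sub match the question asks for, the first quoted string needs a capturing group of its own. The following is a rough Python sketch of that idea, not part of the original answer: the names DOUBLE, SINGLE, QUOTED, CALL and first_string_argument are made up for illustration, and the choice of Python is an assumption.

```python
import re

# One quoted string, built from the same pieces as the expression above.
# A backslash escapes whatever character follows it, so \" or \' does not
# end the string.
DOUBLE = r'"(?:[^"\\]+|\\.)*"'
SINGLE = r"'(?:[^'\\]+|\\.)*'"
QUOTED = f"(?:{DOUBLE}|{SINGLE})"

# Same structure as the answer's expression, with one addition: a capturing
# group (group 1) around the first quoted string.
CALL = re.compile(rf"\(\s*({QUOTED})(?:\s*,\s*{QUOTED})*\s*\)")


def first_string_argument(call):
    """Return the first quoted argument without its surrounding quotes.

    Escape sequences are left exactly as written in the source.
    Returns None when the text does not look like a call with quoted arguments.
    """
    match = CALL.search(call)
    if match is None:
        return None
    return match.group(1)[1:-1]  # drop the opening and closing quote


print(first_string_argument('("Lorem ipsum dolor sit amet.")'))
print(first_string_argument(r'("Lorem ipsum dolor \"sit\" amet.")'))
print(first_string_argument('''("Lorem ipsum dolor 'sit' amet.")'''))
print(first_string_argument('''('Lorem ipsum. ', 'arg1', "arg2")'''))
```

With this sketch all four calls return the full first argument, including the inner quotes. The backslashes in front of escaped quotes are kept, so unescaping them would be a separate step.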

Make sure you don't match a quote when it is escaped (preceded by a backslash):

/\((["'])([^"']+)[^\\]\1,?.*?\)/
