JavaScript-flavor regex for identifying valid Python strings enclosed within triple quotes

I'm trying to write a Prettify-style syntax highlighter for Qiskit Terra (which closely follows Python syntax). Apparently, Prettify uses JavaScript-flavor regex. For instance, /^\"(?:[^\"\\]|\\[\s\S])*(?:\"|$)/, null, '"' is the pattern entry for valid strings in Q#. Basically, I'm trying to put together the equivalent regex for Python.

Now, I know that Python supports strings within triple quotes, i.e. '''<string>''' and """<string>""" are valid strings (this format is used especially for docstrings). To deal with this case, I wrote the corresponding capturing group as:

(^\'{3}(?:[^\\]|\\[\s\S])*(?:\'{3}$))

Here is the regex101 link.

This works okay except in some cases like:

''' 'This "is" my' && "first 'regex' sentence." ''' &&
''' 'This "is" the second.' '''

Clearly, it should have treated ''' 'This "is" my' && "first 'regex' sentence." ''' as one string and ''' 'This "is" the second.' ''' as another. Instead, the regex I wrote groups the whole thing together as one string (check the regex101 link). That is, it doesn't end the string when it encounters the ''' that closes the opening '''.
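The over-match can be reproduced outside regex101. The following is a minimal Python sketch, assuming the sample is exactly the two lines shown above and that the multiline flag is on (the flags used in the regex101 link are not known). The name `sample` is introduced here for illustration only. The cause appears to be that `[^\\]` accepts any character other than a backslash, including `'` and newlines, so the greedy repetition runs to the end of the input and only backtracks far enough to find a `'''` that is followed by `$`. The first closing `'''` is followed by ` &&`, so `$` cannot match there.

```python
import re

# The two-line sample from the question (assumed to be the whole input).
sample = (
    "''' 'This \"is\" my' && \"first 'regex' sentence.\" ''' &&\n"
    "''' 'This \"is\" the second.' '''"
)

# The pattern from the question, copied verbatim into a raw string.
pattern = re.compile(r"(^\'{3}(?:[^\\]|\\[\s\S])*(?:\'{3}$))", re.MULTILINE)

matches = [m.group(0) for m in pattern.finditer(sample)]

print(len(matches))          # 1: a single match is found
print(matches[0] == sample)  # True: that match covers both lines
```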

How should I modify the regex (^\'{3}(?:[^\\]|\\[\s\S])*(?:\'{3}$)) to take this case into account? I'm aware of this: How to match “anything up until this sequence of characters” in a regular expression? but it doesn't quite answer my question, at least not directly.

I don't know what else you want to use this for, but the following regex does what you want for the given example when the MULTILINE flag is on.

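import re

# The two-line sample from the question.
My_string = """''' 'This "is" my' && "first 'regex' sentence." ''' &&
''' 'This "is" the second.' '''"""
# With MULTILINE, ^ matches at the start of each line; .* stays on one line.
My_search = re.findall(r"^'{3}(.*)'{3}", My_string, re.MULTILINE)

print(My_search[0])
print(My_search[1])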

The output is:

'This "is" my' && "first 'regex' sentence." 
'This "is" the second.' 

You can also see it working here: https://regex101.com/r/k4adk2/11
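The regex above relies on each string sitting on its own line, so it would not cover a docstring that spans several lines. A possible alternative, offered only as a sketch and not as part of the original answer, is to let the body consume anything except a run of three quotes: a non-quote, non-backslash character, a backslash escape, or a single ' that is not followed by two more. The name TRIPLE_SINGLE is hypothetical. The pattern uses only a negative lookahead, which JavaScript regex also supports, so it should carry over to Prettify if it is anchored with ^ the way the Q# pattern quoted in the question is. A mirrored pattern would be needed for """.

```python
import re

# Hypothetical pattern: opening ''', then any mix of
#   [^'\\]     a character that is neither a quote nor a backslash
#   \\[\s\S]   a backslash escape (any escaped character, including a quote)
#   '(?!'')    a quote that does not start a run of three quotes
# and finally the closing '''.
TRIPLE_SINGLE = re.compile(r"'''(?:[^'\\]|\\[\s\S]|'(?!''))*'''")

# The two-line sample from the question.
sample = (
    "''' 'This \"is\" my' && \"first 'regex' sentence.\" ''' &&\n"
    "''' 'This \"is\" the second.' '''"
)

# The pattern has no capturing groups, so findall returns the full matches.
strings = TRIPLE_SINGLE.findall(sample)
for s in strings:
    print(s)
```

On the sample this should yield the two triple-quoted strings separately, with the && between them left unmatched. Because `[^'\\]` also matches newlines, a string that continues over several lines would be picked up as one match.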
