I have been sifting through very similar questions but I am still stumped. I need to split a string on any non-alphanumeric character and keep the delimiters, except for parts of the string in double quotes, which should stay intact. Hence, for:
string = 'let a = 5 * (other) if x is "constant";'
re.split(pattern, string)
should yield:
['let', 'a', '=', '5', '*', '(', 'other', ')', 'if', 'x', 'is', '"constant"', ';']
I am getting pretty close with:
re.split(r"(\W)", fragment)
(except for whitespace that I filter out separately) but I cannot manage the double quotes.
Any help appreciated.
You can use
import re
s = 'let a = 5 * (other) if x is "constant";'
print( re.findall(r'"[^"]*"|\w+|[^\w\s]', s) )
Details:
"[^"]*" - a ", zero or more chars other than ", and then a "
| - or
\w+ - one or more word chars
| - or
[^\w\s] - a char other than a word char or a whitespace char.

Another option that works for this particular input (it relies on re.split splitting on empty matches, which needs Python 3.7+):
re.split(r'[ ]|(?<=[(])|(?=[);])', string)
This gives:
['let', 'a', '=', '5', '*', '(', 'other', ')', 'if', 'x', 'is', '"constant"', ';']
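For reuse, the re.findall pattern can be wrapped in a small helper. This is only a sketch: the names TOKEN_RE and tokenize are made up for illustration, and it assumes quoted parts never contain an escaped double quote. The second call shows that spaces and punctuation inside the quotes are left alone.

```python
import re

# Alternatives are tried left to right, so a quoted part wins over the
# other two: "..." first, then a run of word characters, then any single
# character that is neither a word character nor whitespace.
TOKEN_RE = re.compile(r'"[^"]*"|\w+|[^\w\s]')  # hypothetical name


def tokenize(text):
    """Return the tokens of text, keeping double-quoted parts whole."""
    return TOKEN_RE.findall(text)


print(tokenize('let a = 5 * (other) if x is "constant";'))
# ['let', 'a', '=', '5', '*', '(', 'other', ')', 'if', 'x', 'is', '"constant"', ';']

print(tokenize('say "hello, world (again)" + b;'))
# ['say', '"hello, world (again)"', '+', 'b', ';']
```

Because whitespace matches none of the three alternatives, it is dropped without a separate filtering step.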