I need to extract the two strings that follow a keyword. Here is an example string:
\plottwo{image1}{image2}
The keyword is \plottwo and the correct result is
[image1, image2]
I know that if there is only one string next to the keyword, I can use
re.findall(r'\\plottwo.*?{(.*?)}', text)
How can I extend this to two strings?
Note that this matches exactly two image strings:
import re
matcher = re.compile(r"""\\plottwo # The literal \plottwo
{ # Opening brace for the first image group
( # Start the first capture group
[^}]+ # Match anything OTHER than a closing brace
) # End the first capture group
} # Closing brace
{ # Opening brace for the second image group
( # Start the second capture group
[^}]+ # Match anything OTHER than a closing brace
) # End the second capture group
} # Closing brace
""", re.VERBOSE)
print(matcher.findall('\\plottwo{image1}{image2}'))
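Because the pattern has two capture groups, findall returns a list of tuples (one tuple per match) rather than the flat list shown in the question. A minimal sketch of unpacking the result follows; the compact one-line pattern and the variable names are illustrative, not part of the original answer.

```python
import re

# Compact equivalent of the verbose pattern above (braces escaped for clarity).
pattern = re.compile(r"\\plottwo\{([^}]+)\}\{([^}]+)\}")

text = r"\plottwo{image1}{image2}"

# With two capture groups, findall returns one tuple per match.
matches = pattern.findall(text)
print(matches)  # [('image1', 'image2')]

# Flatten the first match into the list form asked for in the question.
first = list(matches[0]) if matches else []
print(first)  # ['image1', 'image2']
```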
If you wish to capture either one or two image strings, make one of the capture groups optional:
import re
matcher = re.compile(r"""\\plottwo # The literal \plottwo
{ # Opening brace for the first image group
( # Start the first capture group
[^}]+ # Match anything OTHER than a closing brace
) # End the first capture group
} # Closing brace
(?: # Non-capturing group that we can make optional
{ # Opening brace for the second image group
( # Start the second capture group
[^}]+ # Match anything OTHER than a closing brace
) # End the second capture group
} # Closing brace
)? # End the non-capturing group
""", re.VERBOSE)
print(matcher.findall('\\plottwo{image1}{image2}'))
print(matcher.findall('\\plottwo{image2}'))
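With the optional group, findall reports a missing second image as an empty string, so the second call yields [('image2', '')]. If that placeholder is unwanted, one option is to use search and drop the groups that did not participate. This is a sketch under that assumption; the helper name image_names is hypothetical.

```python
import re

# Same idea as the verbose pattern above: the second brace group is optional.
pattern = re.compile(r"\\plottwo\{([^}]+)\}(?:\{([^}]+)\})?")

def image_names(text):
    # Return the names captured by the first match, skipping an absent group.
    match = pattern.search(text)
    if match is None:
        return []
    # groups() yields None for an optional group that did not match.
    return [g for g in match.groups() if g is not None]

print(image_names(r"\plottwo{image1}{image2}"))  # ['image1', 'image2']
print(image_names(r"\plottwo{image2}"))          # ['image2']
```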
But to echo one of the comments, regex is typically not the best way to do complex parsing jobs (and sometimes even simple parsing jobs :-).
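To illustrate that remark: the [^}]+ patterns above stop at the first closing brace, so an argument such as {fig_{a}} would be cut short. A small hand-written scanner that counts brace depth avoids this. The function below is only a sketch with a hypothetical name (brace_args); it does not handle LaTeX comments or escaped braces such as \{.

```python
def brace_args(text, keyword="\\plottwo"):
    # Collect consecutive {...} arguments after the first occurrence of keyword.
    start = text.find(keyword)
    if start == -1:
        return []
    pos = start + len(keyword)
    args = []
    while pos < len(text) and text[pos] == "{":
        depth = 0
        for end in range(pos, len(text)):
            if text[end] == "{":
                depth += 1
            elif text[end] == "}":
                depth -= 1
                if depth == 0:
                    break  # found the brace that closes this argument
        else:
            break  # unbalanced braces: stop without recording a partial argument
        args.append(text[pos + 1:end])
        pos = end + 1
    return args

print(brace_args(r"\plottwo{image1}{image2}"))  # ['image1', 'image2']
print(brace_args(r"\plottwo{fig_{a}}{b}"))      # ['fig_{a}', 'b']
```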