I need to find a multi-line pattern in Python, so I am using re.DOTALL, but the regex matches more than I need.
Sample file:
if(condition_1)
{
....
some text
some text
if ((condition_1== condition_2) ||
(condition_3== condition_4) ||
(condition_6== condition_5) ||
(condition_7== condition_8) ) // XYZ_variable
{
...
My Python regex is as follows:
re.compile(r'(if\s*?\()(.*?)(\/\/\s*?)(XYZ_variable)', re.DOTALL)
This matches from the first if condition all the way to XYZ_variable, but I need only the second if condition, the one where XYZ_variable is present.
So I changed my regex as follows, but it does not work:
re.compile(r'(if\s*?\()([^\{].*?)(\/\/\s*?)(XYZ_variable)', re.DOTALL)
My final output should look like this:
if(condition_1)
{
....
some text
some text
if (((condition_1== condition_2) ||
(condition_3== condition_4) ||
(condition_6== condition_5) ||
(condition_7== condition_8) ) || XYZ_variable )
{
...
but my regex produces this instead:
if ((condition_1)
{
....
some text
some text
if ((condition_1== condition_2) ||
(condition_3== condition_4) ||
(condition_6== condition_5) ||
(condition_7== condition_8) ) || XYZ_variable )
{
...
You may use
re.sub(r'(?m)^(\s*if\s*)(\(.*(?:\n(?!\s*if\s*\().*)*)//\s*(\w+)\s*$', r'\1(\2 || \3)', s)
Details
(?m) - re.M flag
^ - start of a line
(\s*if\s*) - Group 1: if enclosed with 0+ whitespaces
(\(.*(?:\n(?!\s*if\s*\().*)*) - Group 2:
    \( - a (
    .* - the rest of the line
    (?:\n(?!\s*if\s*\().*)* - 0 or more repetitions of
        \n(?!\s*if\s*\() - a newline, LF, that is not followed with if enclosed with 0+ whitespaces and then followed with (
        .* - the rest of the line
//\s* - // and 0+ whitespaces
(\w+) - Group 3: 1 or more word chars
\s*$ - 0+ whitespaces and end of line.

import re
s = """if(condition_1)
{
....
some text
some text
if ((condition_1== condition_2) ||
(condition_3== condition_4) ||
(condition_6== condition_5) ||
(condition_7== condition_8) ) // XYZ_variable
{
..."""
print( re.sub(r'(?m)^(\s*if\s*)(\(.*(?:\n(?!\s*if\s*\().*)*)//\s*(\w+)\s*$', r'\1(\2 || \3)', s) )
Output:
if(condition_1)
{
....
some text
some text
if (((condition_1== condition_2) ||
(condition_3== condition_4) ||
(condition_6== condition_5) ||
(condition_7== condition_8) ) || XYZ_variable)
{
...
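As a reading aid, the same pattern can be spelled out with re.VERBOSE so that each piece from the breakdown above sits next to its own comment. This is only a sketch: the names IF_WITH_COMMENT and fold_comment_into_condition are made up for illustration, and the sample is a shortened version of the input from the question.

```python
import re

# Same pattern as in the answer, written with re.VERBOSE so each part can be commented.
IF_WITH_COMMENT = re.compile(r"""
    ^(\s*if\s*)                    # group 1: "if" plus surrounding whitespace
    (                              # group 2: the whole condition
        \(.*                       #   opening paren and the rest of that line
        (?:\n(?!\s*if\s*\().*)*    #   more lines, none of which starts a new "if ("
    )
    //\s*(\w+)\s*$                 # group 3: the word after the trailing // comment
    """, re.MULTILINE | re.VERBOSE)


def fold_comment_into_condition(source):
    """Hypothetical helper: wrap the condition and OR it with the commented name."""
    return IF_WITH_COMMENT.sub(r"\1(\2 || \3)", source)


# Shortened sample: the first "if" has no trailing // comment, the second one does.
sample = """if(condition_1)
{
if ((condition_1== condition_2) ||
(condition_7== condition_8) ) // XYZ_variable
{"""

result = fold_comment_into_condition(sample)
print(result)
```

Only the second if is rewritten, because the lookahead (?!\s*if\s*\() stops group 2 from running across the start of another if ( line. Note that group 2 keeps whatever whitespace precedes the //, so the spacing before || in the result depends on the input.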
A regular expression search returns the first (leftmost) match it can find. That is why the match always starts at the first if.
Consider the following minimal example, where the non-greedy ? does not change the output:
>>> re.compile(r"if(.*?)XYZ").search("if a if b if c XYZ").group(1)
' a if b if c '
But here, the non-greedy ? does change the output:
>>> re.compile(r"if(.*?)XYZ").search("if a XYZ if b if c XYZ").group(1)
' a '
The non-greedy ? only affects where the match ends (its right side). The start is still the leftmost position from which a match is possible.
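One common way to make the match start at the last if instead is to forbid another if inside the gap, using a negative lookahead on every character. This is a sketch on the same toy string, not something taken from the answers above, and the variable names are illustrative.

```python
import re

text = "if a if b if c XYZ"

# Plain lazy match: starts at the leftmost "if" and runs until the first "XYZ".
lazy = re.search(r"if(.*?)XYZ", text).group(1)

# Each character in the gap is consumed only if it does not start another "if",
# so the match can only begin at the "if" closest to "XYZ".
tempered = re.search(r"if((?:(?!if).)*?)XYZ", text).group(1)

print(repr(lazy))      # ' a if b if c '
print(repr(tempered))  # ' c '
```

The lookahead plays the same role as the (?!\s*if\s*\() check in the accepted pattern: it stops the match from crossing the start of another if.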