Matching multiline text using re.findall in Python

Question

The file Myfile has the following content:

  username: prpadiya
  url: https://bhgerrit.ext.net.nokia.com:443/154588
  commitMessage: Handling for change in the path of cm library to /etc/opt/nokia/CMNMS/plugins.
                 NATP
  createdOn: 2020-05-22 12:52:52 IST

I need to match from "commitMessage" through however many lines the commit message spans. In the file above, the message has one extra line, ending with "NATP". I used re.DOTALL but still had no luck. Can anyone help me? My code is as follows:

for line in myfile:
    if re.findall("^commitMessage:\s.*[\r\n].*", line, re.DOTALL):
        print("Line is ::", line)
        msg = line.split('commitMessage:')[-1]
        print("Msg is ::", msg)
        break

Answer 1

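To match a pattern across lines, read the file as a single string rather than line by line. Each line produced by the loop contains at most one line break, at its very end, so no pattern can reach into the following line, with or without re.DOTALL. The pattern then has to require horizontal whitespace at the beginning of each subsequent line after commitMessage, so that the match stops at the next key.

Use myfile.read() with this pattern: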

(?m)^commitMessage:.*(?:\n[^\S\n].*)*

Details:

(?m)^ - start of a line ((?m) is the inline modifier equivalent to re.M: it makes ^ match at the start of each line and $ at the end of each line)
commitMessage: - a literal string
.* - the rest of the line, .* matches any 0 or more chars other than line break chars, as many as possible
(?:\n[^\S\n].*)* - any 0 or more repetitions of:
- \n - a newline
- [^\S\n] - a single whitespace character other than LF (so a continuation line must be indented)
- .* - any 0 or more chars other than line break chars, as many as possible
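The breakdown above can be checked against an in-memory copy of the file. This is only a sketch: SAMPLE is a hand-typed stand-in for Myfile, and it assumes the keys start at column 0, which is what the ^commitMessage: anchor expects. It compares matching one line at a time with matching the whole text.

```python
import re

# Hand-typed stand-in for Myfile (assumption: keys start at column 0).
SAMPLE = (
    "username: prpadiya\n"
    "url: https://bhgerrit.ext.net.nokia.com:443/154588\n"
    "commitMessage: Handling for change in the path of cm library to /etc/opt/nokia/CMNMS/plugins.\n"
    "               NATP\n"
    "createdOn: 2020-05-22 12:52:52 IST\n"
)

PATTERN = re.compile(r"(?m)^commitMessage:.*(?:\n[^\S\n].*)*")

# One line at a time: the continuation line is never visible to the pattern.
per_line = [m.group() for line in SAMPLE.splitlines() for m in PATTERN.finditer(line)]

# Whole text: the (?:\n[^\S\n].*)* group consumes every indented follow-up line
# and stops at "createdOn", which starts with a non-whitespace character.
whole = PATTERN.findall(SAMPLE)

print(per_line)
print(whole)
```

The per-line result holds only the first line of the message, while the whole-text result also contains the indented NATP line.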

Python:

import re

with open(fpath, 'r') as myfile:  # fpath: path to Myfile
    print(re.findall(r'(?m)^commitMessage:.*(?:\n[^\S\n].*)*', myfile.read()))
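If only the message text is wanted, without the commitMessage: prefix and the indentation of the continuation lines, one possible follow-up is to capture everything after the key and collapse the whitespace. The helper name extract_commit_message is hypothetical, and joining the lines with single spaces is an assumption about the desired output rather than something the question asks for.

```python
import re

# Same pattern as above, with a capturing group around the message body.
COMMIT_RE = re.compile(r"(?m)^commitMessage:(.*(?:\n[^\S\n].*)*)")


def extract_commit_message(text):
    """Return the commit message as one line, or None if the key is absent.

    Hypothetical helper for illustration; not part of the original answer.
    """
    match = COMMIT_RE.search(text)
    if match is None:
        return None
    # str.split() with no arguments splits on any run of whitespace,
    # including the newline and indentation of continuation lines.
    return " ".join(match.group(1).split())
```

It would be called as extract_commit_message(myfile.read()) inside the with block.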
