
Matching multiple lines using re.findall in Python

The file Myfile has the following content:

  username: prpadiya
  url: https://bhgerrit.ext.net.nokia.com:443/154588
  commitMessage: Handling for change in the path of cm library to /etc/opt/nokia/CMNMS/plugins.
                 NATP
  createdOn: 2020-05-22 12:52:52 IST

I need to match everything from "commitMessage" through all the lines of the commit message, however many there are. In the file above, the message has one extra line, ending with "NATP". I tried re.DOTALL, but it did not help. Can anyone help? My code is as follows:

for line in myfile:
    if re.findall("^commitMessage:\s.*[\r\n].*", line, re.DOTALL):
        print("Line is ::", line)
        msg = line.split('commitMessage:')[-1]
        print("Msg is ::", msg)
        break

If you need to match a pattern across lines, read the file as a single string rather than line by line. Then require horizontal whitespace at the start of each continuation line that follows commitMessage.
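To illustrate why the line-by-line loop cannot work, here is a minimal sketch that runs the question's pattern on each line and then runs a continuation-aware pattern on the whole text. The `text` string is an assumption: a shortened in-memory stand-in for the file, with no leading indentation and LF line endings.

```python
import re

# Hypothetical stand-in for the file contents (shortened, LF line endings assumed).
text = (
    "username: prpadiya\n"
    "commitMessage: Handling for change in the path of cm library.\n"
    "               NATP\n"
    "createdOn: 2020-05-22 12:52:52 IST\n"
)

# The question's pattern, applied to one line at a time. Each line ends at its
# own newline, so the final .* has nothing left to match and "NATP" is never seen.
question_pattern = r"^commitMessage:\s.*[\r\n].*"
per_line = [
    match
    for line in text.splitlines(keepends=True)
    for match in re.findall(question_pattern, line, re.DOTALL)
]

# A continuation-aware pattern applied to the whole text at once.
whole = re.findall(r"(?m)^commitMessage:.*(?:\n[^\S\n].*)*", text)

print(per_line)
print(whole)
```

The per-line result contains only the first line of the message, while the whole-text result also includes the indented continuation line.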

Use myfile.read() with the following pattern:

(?m)^commitMessage:.*(?:\n[^\S\n].*)*

Details:

  • (?m)^ - start of a line ((?m) is the inline modifier equivalent to re.M; it makes ^ match at the start of each line and $ at the end of each line)
  • commitMessage: - a literal string
  • .* - the rest of the line; .* matches any 0 or more chars other than line break chars, as many as possible
  • (?:\n[^\S\n].*)* - any 0 or more repetitions of:
    • \n - a newline
    • [^\S\n] - any whitespace char other than LF
    • .* - any 0 or more chars other than line break chars, as many as possible
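The pieces above can be checked in isolation. The sketch below, using made-up sample strings, shows that `[^\S\n]` accepts spaces and tabs but not a newline, and that the repeated group stops at the first line that does not start with whitespace.

```python
import re

# [^\S\n] is a double negative: "neither a non-whitespace char nor a newline",
# which leaves whitespace other than LF (space, tab, and so on).
horizontal = re.findall(r"[^\S\n]", "a \t\nb")

# Hypothetical sample with two continuation lines, one indented with spaces
# and one with a tab, followed by an unindented field.
sample = "commitMessage: first\n  second\n\tthird\ncreatedOn: x"
match = re.search(r"(?m)^commitMessage:.*(?:\n[^\S\n].*)*", sample)

print(horizontal)
print(repr(match.group(0)))
```

Because the group requires whitespace right after each newline, the unindented `createdOn` line ends the match.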

Python:

import re

with open(fpath, 'r') as myfile:  # fpath: path to the file shown above
    print(re.findall(r'(?m)^commitMessage:.*(?:\n[^\S\n].*)*', myfile.read()))
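If only the message text is wanted, as in the question's `msg` variable, one possible follow-up is to capture everything after `commitMessage:` and collapse the continuation lines into a single line. This is a sketch under assumptions: the helper name `extract_commit_messages` is hypothetical, the sample is an in-memory copy of the file without leading indentation, and joining with single spaces is just one reasonable choice.

```python
import re

def extract_commit_messages(text):
    """Return each commitMessage value with continuation lines joined by single spaces."""
    # Group 1 captures the text after 'commitMessage:' plus any indented continuation lines.
    pattern = re.compile(r"^commitMessage:(.*(?:\n[^\S\n].*)*)", re.M)
    # str.split() with no arguments splits on any run of whitespace, so joining
    # with a single space collapses the newlines and the indentation.
    return [" ".join(value.split()) for value in pattern.findall(text)]

# In-memory copy of the file from the question (LF line endings assumed).
sample = (
    "username: prpadiya\n"
    "url: https://bhgerrit.ext.net.nokia.com:443/154588\n"
    "commitMessage: Handling for change in the path of cm library to /etc/opt/nokia/CMNMS/plugins.\n"
    "               NATP\n"
    "createdOn: 2020-05-22 12:52:52 IST\n"
)

messages = extract_commit_messages(sample)
print(messages)
```

With a capturing group in the pattern, `findall` returns the captured text rather than the whole match, so the `commitMessage:` prefix is dropped without a separate `split` call.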


 