Regular expression to capture n lines of text between two regex patterns

Question

I need help with a regular expression that grabs exactly n lines of text between two regex matches. For example, I need 17 lines of text, and the example below, which I tried, does not work.

Please see sample code below:

import re
# t is assumed to hold the sample text linked below
match_string = re.search(r'^.*MDC_IDC_RAW_MARKER((.*?\r?\n){17})Stored_EGM_Trigger.*\n', t, re.DOTALL).group()
value1 = re.search(r'value="(\d+)"', match_string).group(1)
value2 = re.search(r'value="(\d+\.\d+)"', match_string).group(1)
print(match_string)
print(value1)
print(value2)

I put a sample string here, because SO does not allow long code strings: https://hastebin.com/aqowusijuc.xml

Answer 1

You are getting false positives because you are using the re.DOTALL flag, which allows the . character to match newline characters. That is, when you are matching ((.*?\r?\n){17}), a single .*? can eat up several lines at once, so the 17 repetitions can cover more than 17 lines just to satisfy your required count. The \r? is also superfluous: without re.DOTALL, the . matches every character except \n, and that includes \r. Starting your regex with ^.* is superfluous as well, because you are forcing the search to start from the beginning of the string and then telling the engine to skip as many characters as necessary to find MDC_IDC_RAW_MARKER, which re.search does on its own. So, a simplified and correct regex would be:

match_string = re.search(r'MDC_IDC_RAW_MARKER.*\n((.*\n){17})Stored_EGM_Trigger.*\n', t)
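To see the difference in practice, here is a minimal sketch. The sample text is made up (it is not the data from the question), and the helper names make_sample and lines_between are hypothetical, introduced only for illustration. The helper builds the same shape of pattern as above for an arbitrary n, using a non-capturing inner group so that group 1 holds the whole n-line block.

```python
import re

# The pattern from the question (with the stray parenthesis removed).
ORIGINAL = r'^.*MDC_IDC_RAW_MARKER((.*?\r?\n){17})Stored_EGM_Trigger.*\n'


def make_sample(body_count):
    """Build made-up text: a start line, body_count body lines, an end line."""
    body = ''.join('<line value="%d"/>\n' % i for i in range(body_count))
    return 'MDC_IDC_RAW_MARKER start\n' + body + 'Stored_EGM_Trigger end\n'


def lines_between(text, start_marker, end_marker, n):
    """Return the n lines between the start-marker line and the line that
    begins with end_marker, or None when the line count is not exactly n."""
    pattern = (
        re.escape(start_marker) + r'.*\n'  # rest of the start-marker line
        + r'((?:.*\n){%d})' % n            # exactly n whole lines, group 1
        + re.escape(end_marker)            # the following line starts here
    )
    # No re.DOTALL: each .* stops at the end of its own line.
    m = re.search(pattern, text)
    return m.group(1) if m else None


exact = make_sample(17)     # 17 lines between the markers
too_many = make_sample(20)  # 20 lines between the markers

# With re.DOTALL the question's pattern still matches the 20-line sample,
# because one .*? repetition can span several lines.
false_positive = re.search(ORIGINAL, too_many, re.DOTALL) is not None

block = lines_between(exact, 'MDC_IDC_RAW_MARKER', 'Stored_EGM_Trigger', 17)
rejected = lines_between(too_many, 'MDC_IDC_RAW_MARKER', 'Stored_EGM_Trigger', 17)

print(false_positive)
print(block.count('\n'))
print(rejected)
```

Note that re.search returns a match object, so call .group() (or .group(1) for just the captured lines) before passing the result to the later value="..." searches.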

