Python 3.6 Regex Producing Unexpected Results (despite using string literals)

Question

A short time ago I had an almost identical problem to this one, and it was fixed by using string literals instead of literal strings. This time I took care to use string literals, but that didn't fix the problem.

I am trying to extract a section from a string, and the results I get from Python are different from what regex101 shows I should be getting. I'm using this pattern:

Supersedes:?[\\r\\n ]+(?:[A-Za-z\-0-9])*[\\w\-\\s]+[\\r\n ]+(.*)[\\r\\n ]+Serial Numbers:?

to match this text:

\\r\\n\\r\\nSupersedes\\r\\nNone\\r\\n\\r\\nChanges to VGA-77 gas module assembly (0110444290)\\r\\n\\r\\nService Serial Numbers:\\r\\nUS00000000-US99999999\\r\\n\\r

I'm expecting the first captured group to give me

n\r\nChanges to VGA-77 gas module assembly (0110444290)\r\n\r\nService

https://regex101.com/r/eHdhBV/2

But when I try this in Python:

import re

rx = r'Supersedes:?[\r\n ]+(?:[A-Za-z\-0-9])*[\w\-\s]+[\r\n ]+(.*)[\r\n ]+Serial Numbers:?'
string = '\r\n\r\nSupersedes\r\nNone\r\n\r\nChanges to VGA-77 gas module assembly (0110444290)\r\n\r\nService Serial Numbers:\r\nUS00000000-US99999999\r\n\r'
result = re.search(rx, string, re.M | re.S)
print(repr(result[1]))
# Output:
# '(0110444290)\r\n\r\nService'

The result is not the same as what is shown on regex101. What's causing this?

Answer 1

To solve the current issue, you may use

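import re

# s is the input text (the variable named string in the question)
m = re.search(r'Supersedes:?\s*[^\r\n]*[\r\n]+(.*?)[ \r\n]+Serial Numbers', s, re.S)
if m:
    print(m.group(1))  # Group 1 holds the extracted section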

See the regex demo online.

Please note that online regex testers take the test text literally and do not interpret escape sequences, so you should convert each \r and \n in your test text into an actual line break before testing.
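As a rough illustration of that difference (the variable names below are made up for this example and are not from the question): an ordinary Python literal turns \r\n into two control characters, whereas text pasted into an online tester behaves like a raw literal and keeps each backslash and letter as separate characters. A minimal sketch:

```python
import re

# In an ordinary literal, \r\n is interpreted: two characters (CR, LF).
interpreted = 'Supersedes\r\nNone'

# In a raw literal, as in text pasted into an online tester, the
# backslashes stay literal: \r\n is four characters.
pasted = r'Supersedes\r\nNone'

print(len(interpreted))  # 16
print(len(pasted))       # 18

# A class of real CR/LF characters only finds something in the first string.
line_breaks = re.compile(r'[\r\n]+')
print(line_breaks.search(interpreted) is not None)  # True
print(line_breaks.search(pasted) is not None)       # False
```

So a pattern tuned against the pasted form is matching backslashes and letters, not line breaks, and can behave differently once Python hands it real CR and LF characters.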

Pattern details

Supersedes:? - Supersedes: or Supersedes
\s* - any 0+ whitespace chars
[^\r\n]* - any 0+ chars other than LF and CR
[\r\n]+ - 1+ LF or CR symbols
(.*?) - Group 1: any 0+ chars, as few as possible (with re.S, the dot also matches line breaks)
[ \r\n]+ - 1+ spaces, CR or LF
Serial Numbers - a literal Serial Numbers string.
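
For reference, here is a small sketch that applies the pattern described above to the string from the question. The names text, pattern and section are chosen just for this example.

```python
import re

# The input from the question, written as an ordinary (non-raw) literal
# so that \r and \n become real CR and LF characters.
text = ('\r\n\r\nSupersedes\r\nNone\r\n\r\n'
        'Changes to VGA-77 gas module assembly (0110444290)\r\n\r\n'
        'Service Serial Numbers:\r\nUS00000000-US99999999\r\n\r')

# The pattern is a raw literal so the regex engine itself sees \s, \r and \n.
pattern = re.compile(
    r'Supersedes:?\s*[^\r\n]*[\r\n]+(.*?)[ \r\n]+Serial Numbers',
    re.S,  # let the dot match line breaks too
)

m = pattern.search(text)
section = m.group(1) if m else None
print(repr(section))
# 'Changes to VGA-77 gas module assembly (0110444290)\r\n\r\nService'
```

Group 1 still ends with Service because, in this input, that word sits before the space that precedes Serial Numbers.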

solution1
0 ACCEPTED 2018-07-18 11:54:42
