AttributeError: 'NoneType' object has no attribute 'group' while using re.match

Question

I need to compare the first element of two different files after a certain phrase. So far I have this:

import re

data1 = ""
data2 = ""
first = re.match(r".*Ignore until after this:(?P<data1>.*)", firstlist[0])
second = re.match(r".*Ignore until after this:(?P<data2>.*)", secondarray[0])
data1 = first.group('data1')
data2 = second.group('data2')

if data1 == data2:
  #rest of the code...

I want to ignore everything up to a certain point, and then save the rest into the variable. I do something almost identical to this earlier in the script and it works. However, when I run this, I get this error:

File "myfile.py", line [whatever line it is], in <module>  
data1 = first.group('data1')  
AttributeError: 'NoneType' object has no attribute 'group'

Why isn't re.match working properly with first and second?

EDIT

As suggested, I've changed [\\s\\S]* to .* in the patterns above.

EDIT 2: This is what the input looks like (NOT like in the comment below):

Random text

More random text

Even more random text

Ignore until after this:

Meaningful text, keep this

...and everything else...

...until the end of the file here

That's basically all it is: a string of text that needs to be saved from a certain point onward.

Answer 1

You're probably just having issues because of the newlines in your file. As Martijn Pieters pointed out in the comments on your question, you can pass the re.DOTALL flag so that . also matches newlines and the pattern can span the whole file. So with a file like this (named tmp in this example):

Random text

More random text

Even more random text

Ignore until after this:

Meaningful text, keep this

...and everything else...

...until the end of the file here

You could do something like this

import re

with open('tmp') as f:
  # re.DOTALL makes '.' match newlines too, so '.*' can run across lines
  first = re.match(r'.*Ignore until after this:(?P<data1>.*)', f.read(), re.DOTALL)
  print(first.group('data1'))

which gives

Meaningful text, keep this

...and everything else...

...until the end of the file here
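
Since the original goal was to compare the text after the phrase in two inputs, here is a minimal sketch of how that might look with a guard against a failed match. The helper name text_after_marker, the choice to return None when the phrase is missing, and the sample strings are all assumptions made for illustration; they are not from the question.

```python
import re

# Compiled once; re.DOTALL lets '.' match newlines as well.
MARKER = re.compile(r".*Ignore until after this:(?P<data>.*)", re.DOTALL)

def text_after_marker(text):
    """Hypothetical helper: return the text after the phrase, or None if absent."""
    match = MARKER.match(text)
    if match is None:
        # re.match returns None on failure, so check before calling .group()
        return None
    return match.group("data")

# Made-up sample inputs standing in for the contents of the two files.
first_text = "Random text\n\nIgnore until after this:\n\nMeaningful text, keep this\n"
second_text = "Other text\nIgnore until after this:\n\nMeaningful text, keep this\n"

data1 = text_after_marker(first_text)
data2 = text_after_marker(second_text)

# Only treat the inputs as equal if the phrase was actually found.
same = data1 is not None and data1 == data2
print(same)
```

Checking for None before calling .group() is what avoids the AttributeError. Note that the leading .* is greedy, so if the phrase appears more than once, this captures the text after its last occurrence.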

Answer 2

The dot '.' character in regular expressions matches any character except a newline by default. re.match is also anchored at the start of the string, so if you have your entire file as a single string, .* can only consume the first line, and your phrase has to appear on that same line for the match to succeed. When it doesn't, re.match returns None, and calling .group() on None raises the AttributeError you are seeing.
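
A small sketch of that behaviour, using a made-up three-line string rather than the real file contents:

```python
import re

# Made-up sample: the phrase sits on the second line, not the first.
text = "Random text\nIgnore until after this:\nMeaningful text"
pattern = r".*Ignore until after this:(?P<data>.*)"

# Without re.DOTALL, '.' stops at the first newline, so the match fails.
without_flag = re.match(pattern, text)

# With re.DOTALL, '.' also matches newlines, so '.*' can reach the phrase.
with_flag = re.match(pattern, text, re.DOTALL)

print(without_flag)
print(repr(with_flag.group("data")))
```

The first print shows None, which is the value the question's code then calls .group() on. The second shows the captured text, including the newline that follows the phrase.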


2 answers

solution1
3 ACCEPTED 2013-09-23 20:52:21

solution2
0 2013-09-23 20:30:58
