I need to compare the first element of two different files after a certain phrase. So far I have this:
import re
data1 = ""
data2 = ""
first = re.match(r".*Ignore until after this:(?P<data1>.*)", firstlist[0])
second = re.match(r".*Ignore until after this:(?P<data2>.*)", secondarray[0])
data1 = first.group('data1')
data2 = second.group('data2')
if data1 == data2:
    # rest of the code...
I want to ignore everything up to a certain point, and then save the rest into the variable. I do something almost identical to this earlier in the script and it works. However, when I run this, I get this error:
  File "myfile.py", line [whatever line it is], in <module>
    data1 = first.group('data1')
AttributeError: 'NoneType' object has no attribute 'group'
Why isn't re.match working properly with first and second?
As per suggestion, I've changed [\\s\\S]* to .*.
EDIT 2: This is what the input looks like (NOT like in the comment below):
Random text
More random text
Even more random text
Ignore until after this:
Meaningful text, keep this
...and everything else...
...until the end of the file here
That's really all it is: a string of text that needs to be saved from a certain point onward.
You're probably having issues because of the newlines in your file. As Martijn Pieters pointed out in the comments on your question, you can pass the re.DOTALL flag so that '.' also matches newlines. So with a file like this (named tmp in this example):
Random text
More random text
Even more random text
Ignore until after this:
Meaningful text, keep this
...and everything else...
...until the end of the file here
You could do something like this:
import re
with open('tmp') as f:
    # re.DOTALL makes '.' match newlines too, so '.*' can span the whole file
    first = re.match(r'.*Ignore until after this:(?P<data1>.*)', f.read(), re.DOTALL)
print(first.group('data1'))
which gives
Meaningful text, keep this
...and everything else...
...until the end of the file here
By default, the dot '.' in a regular expression matches any character except a newline. re.match also only matches at the start of the string. So if your entire file is a single string, '.*' cannot get past the first line break, and the pattern can only succeed if your phrase appears on the first line. When it doesn't, re.match returns None, and calling .group() on None raises the AttributeError you are seeing.
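To make the difference concrete, here is a minimal sketch that runs the same pattern with and without re.DOTALL and checks for None before calling .group(). The SAMPLE string and the helper name text_after_marker are made up for illustration. They stand in for your file contents and are not part of your script.

```python
import re

# Hypothetical sample text standing in for the file contents.
SAMPLE = (
    "Random text\n"
    "More random text\n"
    "Ignore until after this:\n"
    "Meaningful text, keep this\n"
    "...until the end of the file here\n"
)

PATTERN = r".*Ignore until after this:(?P<data>.*)"


def text_after_marker(text):
    """Return everything after the marker phrase, or None if it is absent."""
    match = re.match(PATTERN, text, re.DOTALL)
    if match is None:
        # No match: return None instead of failing on match.group().
        return None
    return match.group("data")


# Without re.DOTALL, '.' stops at the first newline, so nothing matches.
print(re.match(PATTERN, SAMPLE))

# With re.DOTALL the match succeeds and the group holds the remaining lines.
print(text_after_marker(SAMPLE))
```

Note that the captured text starts with the newline that follows the marker phrase, so you may want to call .strip() on it before comparing the two files.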