re.search with \s or '\n' is not finding the multiline pattern I'm trying to search for.
Portion of Source:
Date/Time:
2013-08-27 17:05:36
----- BEGIN SEARCH -----
GENERAL DATA:
NAME: AB12
SECTOR:
999,999
CONTROLLED BY: Player
ALLIANCE: Aliance
ONLINE: 1 seconds ago
SIZE: Large
HOMEWORLD: NO
APPROVAL RATING: 100%
PRODUCTION RATE: 100%
RESOURCE DATA:
POWER: 0 / 0
BUILDINGS: 0 / 20
ORE: 80,000 / 80,000
CRYSTAL: 80,000 / 80,000
POPULATION: 40,000 / 40,000
BUILDING DATA:
N/A
UNIT DATA:
WYVERN(S): 100
----- END SEARCH -----
Looking at it in Notepad++ I see "BUILDING DATA:(LF)"
Full Code
lines = open('scan.txt','r').readlines()
for a in lines:
    if re.search(r"\A\d", a):
        digits = a
        if re.search(r"2013", digits):
            date.append(digits[:19])
            count +=1
        elif re.search(r",", digits):
            clean = digits.rstrip()
            sector = clean.split(',')
            x.append(sector[0])
            y.append(sector[1])
    elif re.search(r"CONTROLLED BY:", a):
        player.append(a[15:].rstrip())
    elif re.search(r"ALLIANCE:", a):
        alliance.append(a[10:].rstrip())
    elif re.search(r"SIZE:", a):
        size.append(a[6:].rstrip())
    elif re.findall('BUILDING DATA:\sN/A', a, re.M):
        def_grid = ''
        print "Didn't find it"
        defense.append(def_grid)
        defense_count +=1
    elif re.search(r"DEFENSE GRID", a):
        def_grid = a[16:].rstrip()
        print "defense found"
        defense_count +=1
But nothing is being returned.
I need to insert an empty spacer when "DEFENSE GRID" doesn't exist after "BUILDING DATA:".
I know I'm missing something, and I've tried reading up on re.search, but I haven't been able to find any thorough examples that explain how multiline matching works.
re.findall("BUILDING DATA:\nN/A",a,re.MULTILINE)
You can do just what you did, but using re.findall instead of re.search:
re.findall('BUILDING DATA:\nN/A', a, re.M)
#['BUILDING DATA:\nN/A']
EDIT:
The problem is that you are currently reading line-by-line. In order to detect a pattern that belongs to two or more lines, you have to consider the string as a whole, maybe doing:
s = ''.join(lines)
which is OK if lines is not too big, and then use s to perform your multi-line searches.
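To make the difference concrete, here is a minimal sketch. The sample text and the names `text`, `pattern`, `per_line_hits` and `whole_hits` are made up for illustration, and it assumes plain LF line endings, as Notepad++ showed in the question.

```python
import re

# Hypothetical sample standing in for part of scan.txt.
text = "RESOURCE DATA:\nPOWER: 0 / 0\nBUILDING DATA:\nN/A\nUNIT DATA:\n"
lines = text.splitlines(True)  # keep line endings, like readlines()

pattern = r"BUILDING DATA:\nN/A"

# Line by line: no single line contains both halves of the pattern.
per_line_hits = [ln for ln in lines if re.search(pattern, ln)]

# Whole string: the newline between the two lines is visible to the regex.
s = "".join(lines)
whole_hits = re.findall(pattern, s)

print(per_line_hits)  # []
print(whole_hits)     # ['BUILDING DATA:\nN/A']
```

Each element of lines ends at its own newline, so a pattern that needs text on both sides of "\n" can only match against the joined string.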
I wonder why nothing is returned. If your file looks like this:
BUILDING DATA:
N/A
then running
import re
f = open('test.txt','r')
a = f.read(20)
re.search('BUILDING DATA:\nN/A', a, re.M)
does produce a match:
<_sre.SRE_Match object at 0x1004fc8b8>
If I test re.search with a string that is not in the file, as in this code:
import re
f = open('test.txt','r')
a = f.read(20)
re.search('BUILDING BATA:\nN/A', a, re.M)
there is no output, as expected.
EDIT:
As Saullo Castro pointed out, the problem is the line-by-line reading. Why not use something like this?
a = open('scan.txt','r').read()
if re.findall('BUILDING DATA:\nN/A', a, re.M):
    print('found!')
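Since the goal is one defense entry per scan, reading the whole file also lets you handle each record as a block. The following is only a sketch under assumptions: the function name `defense_per_record` is made up, the records are assumed to be delimited by the BEGIN/END SEARCH marker lines from the question, and the "DEFENSE GRID(S): 5" line is a guessed format, since the question does not show a real one.

```python
import re

def defense_per_record(text):
    """Return one entry per record; '' when no DEFENSE GRID line exists."""
    # re.S lets "." cross newlines so one match can span a whole record.
    blocks = re.findall(
        r"----- BEGIN SEARCH -----\n(.*?)----- END SEARCH -----", text, re.S
    )
    result = []
    for block in blocks:
        # re.M makes ^ and $ match at the start and end of each line.
        m = re.search(r"^DEFENSE GRID[^:\n]*:[ \t]*(.*)$", block, re.M)
        result.append(m.group(1).strip() if m else "")
    return result

# Hypothetical two-record input; the DEFENSE GRID line format is an assumption.
sample = (
    "----- BEGIN SEARCH -----\nBUILDING DATA:\nN/A\n----- END SEARCH -----\n"
    "----- BEGIN SEARCH -----\nBUILDING DATA:\nDEFENSE GRID(S): 5\n"
    "----- END SEARCH -----\n"
)
print(defense_per_record(sample))  # ['', '5']
```

Because every record contributes exactly one entry, the defense list stays aligned with the other per-record lists.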
3rd try:
tmp = False
...
    elif re.findall('BUILDING DATA:', a, re.M):
        tmp = True
    elif tmp and re.findall('N/A', a, re.M):
        def_grid = ''
        print "Didn't find it"
        defense.append(def_grid)
        defense_count +=1
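A self-contained version of this flag idea might look like the sketch below. The function name `collect_defense` and the sample lines are made up, and the "DEFENSE GRID(S): 5" format is an assumption. Unlike the fragment above, the flag is reset on every line, so an "N/A" somewhere later in the file is not mistaken for the one that follows "BUILDING DATA:".

```python
def collect_defense(lines):
    """Line-by-line scan that remembers whether the previous line was the header."""
    defense = []
    prev_was_building = False
    for line in lines:
        text = line.strip()
        if prev_was_building and text == "N/A":
            defense.append("")  # empty spacer: no defense grid in this record
        elif text.startswith("DEFENSE GRID"):
            # Keep whatever follows the first colon (assumed line format).
            defense.append(text.partition(":")[2].strip())
        # The flag is only valid for the line right after the header.
        prev_was_building = (text == "BUILDING DATA:")
    return defense

# Hypothetical lines as readlines() would return them.
sample = [
    "BUILDING DATA:\n", "N/A\n", "UNIT DATA:\n", "WYVERN(S): 100\n",
    "BUILDING DATA:\n", "DEFENSE GRID(S): 5\n", "UNIT DATA:\n",
]
print(collect_defense(sample))  # ['', '5']
```

This keeps the original line-by-line loop and needs no regular expression for the two-line check.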
Replace
re.findall('BUILDING DATA:\sN/A', a, re.M):
with
re.findall('BUILDING DATA:\nN/A', a, re.M):
or
re.search(r'BUILDING DATA:\nN/A', a, re.M):
and it should work.
(Notice that in your code there's \s instead of \n.)
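For what it's worth, a quick check suggests that \s by itself is not what prevents the match: \s matches any whitespace character, including a newline, and re.M only changes how ^ and $ behave. The sketch below uses a made-up sample string, and all variable names in it are illustrative.

```python
import re

# Hypothetical sample: the two lines of interest inside one string.
s = "BUILDING DATA:\nN/A\nUNIT DATA:\n"

# \s matches "\n" too, so both patterns match on the whole string.
with_s = re.findall(r"BUILDING DATA:\sN/A", s)
with_n = re.findall(r"BUILDING DATA:\nN/A", s)

# re.M only makes ^ and $ match at line boundaries as well.
without_m = re.findall(r"^N/A$", s)
with_m = re.findall(r"^N/A$", s, re.M)

# On a single line, as in a line-by-line loop, the pattern cannot match.
single_line = re.findall(r"BUILDING DATA:\nN/A", "BUILDING DATA:\n")

print(with_s == with_n)   # True
print(without_m, with_m)  # [] ['N/A']
print(single_line)        # []
```

So either pattern should work once the search runs against text that actually contains both lines.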