Python regex : Fetch next line after string match

Question

I searched this forum for a close match to my problem but could not find a suitable solution, so I am posting the question.

I am using the urllib and re modules to extract certain sections of a webpage. I am also interested in the status associated with each of those sections.

For example, the source of the webpage contains:

MY-TEXT #1410 finished subtask PREPARE-WORKSPACE #340418: https://cloud6.foo.bar.com/b/job/PREPARE-WORKSPACE/340418

'>SUCCESS

I am using re.compile and re.findall to extract the text that comes after the pattern "https://cloud6.foo". This matches all the expected text, which I have confirmed by inspecting the resulting list, but I am losing the status of each task because it is on the line immediately after the "https://" line.

How can I extract the line that follows the matched string in this scenario?

Here is the code snippet:

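from urllib import urlopen  # Python 2 location of urlopen
import re

# urllink holds the address of the page to scan (defined elsewhere)
webpage = urlopen(urllink).read()
# Capture everything after "<a href='https://" up to the end of that line
buildPhases = re.compile(r'\<a href=\W{1}https\W{3}(.*)')
phaseLists = re.findall(buildPhases, webpage)
for item in phaseLists:
    print item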

Answer 1

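To extract the line after a matching string, you need to add .*\n to your regex. Because . does not match a newline by default, each .*\n consumes the rest of the current line and the line break that ends it.
For example, if we take: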

MY-TEXT #1410 finished subtask PREPARE-WORKSPACE #340418: https://cloud6.foo.bar.com/b/job/PREPARE-WORKSPACE/340418

'>SUCCESS

and apply the pattern r'https.*\n.*\n.*' (in a raw string, a single backslash before n is what matches a newline), the result should be the above string without its leading part:

MY-TEXT #1410 finished subtask PREPARE-WORKSPACE #340418:
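
As a rough illustration of the above, here is a minimal sketch run against the sample text from the question rather than a live page. The names sample, answer_matches, url_then_next_line and results are made up for this example, and it assumes the blank line between the URL line and the status line is really present in the page source. The second pattern is a variation, not part of the original suggestion: it captures the URL and the next non-empty line as separate groups, so the status can be read without splitting the match afterwards.

```python
import re

# Sample text copied from the question. The blank line between the URL
# line and the status line is an assumption about the real page source.
sample = (
    "MY-TEXT #1410 finished subtask PREPARE-WORKSPACE #340418: "
    "https://cloud6.foo.bar.com/b/job/PREPARE-WORKSPACE/340418\n"
    "\n"
    "'>SUCCESS\n"
)

# Pattern from the answer: the URL line, the (blank) line after it,
# and the line after that, returned as one multi-line string.
answer_matches = re.findall(r'https.*\n.*\n.*', sample)

# Variation: group 1 is the URL, group 2 is the first non-empty line
# that follows it. \s*\n\s* skips the line break and any blank lines.
url_then_next_line = re.compile(r'(https://\S+)\s*\n\s*(.+)')

results = []
for url, next_line in url_then_next_line.findall(sample):
    # Drop the leading quote and angle bracket left over from the markup.
    status = next_line.lstrip("'>")
    results.append((url, status))

for url, status in results:
    print("%s -> %s" % (url, status))
```

If the status sits directly on the next line in the real page, with no blank line in between, the answer's pattern needs only one .*\n, while the grouped variation works unchanged.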
