Finding a word on a following line in Python

I'm searching a text file for a certain string and then looking for another string that follows it, either on the next line or further down the document.

An example of the text would look like:

there is a word1. then there is some more text. 
then we are looking for word2 = apple. 

I'm looking to return the word 'apple' together with word1. However, word2 = can be on the next line or further down the document. I've managed to do the below, but it only works if word2 is on the next line, not if it is on line 3, 4, 5, etc. Can anyone help?

if 'word1' in line and 'word2' not in line:        
    nextLine = next(f)
    pattern = re.match('(?:word2=|word2 =)([a-z0-9_])+',nextLine) 
    if pattern:    
        print('word1', pattern)

If I understand the question correctly, here is an example:

string = """

there is a word1. then there is some more text. 
then we are looking for word2 = apple. 


there is a word1. then there is some more text. 
then we are looking for word2 = orange. 



there is a word1. then there is some more text. 
then there is some more text. 
then there is some more text. 
then we are looking for word2= peer. 
"""


import re
result = re.findall(r".*?(word1)[\s\S]*?word2 *=.*?([a-z0-9_]+)", string)
print(result)
# should be [('word1', 'apple'), ('word1', 'orange'), ('word1', 'peer')]

Note: because this matches against the whole string at once, it may not be suitable for large files.
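For a large file, one option is to scan line by line and remember whether word1 has been seen, so the whole file never has to sit in memory. The sketch below is only one way to do this. `find_pairs` is a hypothetical helper name, and it assumes each word1 is paired with the first word2 assignment that follows it, with at most one pair starting per line.

```python
import io
import re

# Matches "word2=apple", "word2 = apple", "word2= apple", ...
WORD2 = re.compile(r"word2 *= *([a-z0-9_]+)")

def find_pairs(lines):
    """Yield ('word1', value) for each word1 followed later by a word2 assignment."""
    waiting = False  # True once word1 has been seen and word2 is still pending
    for line in lines:
        start = 0
        if not waiting:
            start = line.find("word1")
            if start == -1:
                continue  # no word1 yet, nothing to look for
            waiting = True
        # On the word1 line itself, only look to the right of word1.
        match = WORD2.search(line, start)
        if match:
            yield ("word1", match.group(1))
            waiting = False  # wait for the next word1

# io.StringIO stands in for an open file here.
sample = io.StringIO(
    "there is a word1. then there is some more text.\n"
    "then there is some more text.\n"
    "then we are looking for word2= peer.\n"
)
print(list(find_pairs(sample)))  # [('word1', 'peer')]
```

With a real file, `find_pairs(open(path))` would work the same way, since a file object also yields one line at a time.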

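if 'word1' in line and 'word2' not in line:
    while True:
        nextLine = next(f, None)  # None once the file is exhausted
        if nextLine is None:
            break
        # re.search, not re.match: word2 is not at the start of the line
        pattern = re.search(r'word2 ?= ?([a-z0-9_]+)', nextLine)
        if pattern:
            print('word1', pattern.group(1))
            break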

I'm not sure this will work, as I don't have access to a PC right now. Let me know; if it doesn't work, I'll delete it.

Beware, though:

Are all infinite loops bad?

Is while (true) with break bad programming practice?
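If the open-ended loop is a concern, a `for` loop over the same file iterator is one bounded alternative, since it stops by itself at end of file. A minimal sketch, assuming `f` is an open text file (simulated here with `io.StringIO`) and using a hypothetical helper name `first_word2_after_word1`:

```python
import io
import re

def first_word2_after_word1(f):
    """Return the value assigned to word2 after the first word1, or None."""
    for line in f:
        if "word1" in line:
            # Keep consuming the same iterator, starting on the line after word1.
            for next_line in f:
                match = re.search(r"word2 ?= ?([a-z0-9_]+)", next_line)
                if match:
                    return match.group(1)
    return None  # no word1, or no word2 after it

f = io.StringIO(
    "there is a word1. then there is some more text.\n"
    "then there is some more text.\n"
    "then we are looking for word2 = apple.\n"
)
print("word1", first_word2_after_word1(f))  # word1 apple
```

This relies on `f` being its own iterator, as file objects are. A plain list of lines would restart in the inner loop, so wrap it in `iter()` first.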

Read the complete file into one string, and then try this. It captures word1 and whatever is assigned to word2, using capturing groups:

(word1)(?:.*[\n\r]?)+word2 ?= ?(\w+)

It is not clear from your question whether we should match word2 = apple or word2=apple (maybe the last time you mentioned word2= it was a typo?), so I included the ? character, which will make the spaces optional.

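If you want your answer in the format apple + word1, you can do the following, where pattern is the match object, group(1) is word1, and group(2) is the value of word2:

print(pattern.group(2) + " + " + pattern.group(1))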
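Putting these pieces together, here is a rough end-to-end sketch. It departs from the pattern given above in one respect: the repeated group there is greedy, so with several word2 assignments in the text it would run on to the last one. The sketch swaps in a lazy `[\s\S]*?` to stop at the nearest word2. That is a substitution, not the original pattern. The sample text stands in for the contents of the file.

```python
import re

text = (
    "there is a word1. then there is some more text.\n"
    "then there is some more text.\n"
    "then we are looking for word2 = apple.\n"
    "later on, word2 = orange.\n"
)

# [\s\S]*? crosses line breaks but stops at the first word2 after word1.
pattern = re.search(r"(word1)[\s\S]*?word2 ?= ?(\w+)", text)
if pattern:
    # group(1) is word1, group(2) is the value assigned to word2
    print(pattern.group(2) + " + " + pattern.group(1))  # apple + word1
```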
