
Python regex to find specific word in the middle of a text file

I have a text file and I want to find a word in the middle of a line. When I run my .py script, I get an error saying that found_state is not defined.

Consider this file:

file.conf
hostname(config)#aaa new-model
fdfsfd b
kthik
pooooo
shh

My Python script looks like this:

import re;    
import time;

with open('file.conf') as f:
    content = f.readlines()
name=''

for data in content:
    if re.search('(?<=#)\w+',data):
        found_state=1
        name=data
        break
if found_state==1:
    print name + "is Found"
else:
    print "NF"

If the condition re.search('(?<=#)\w+', data) never matches, found_state is never assigned, so the check after the loop raises a NameError. Initialise found_state before the for loop.
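To illustrate the point about initialising the flag, here is a minimal sketch. The helper name find_state is made up for this example and is not part of the original script; it only shows that the flag has a value whether or not any line matches.

```python
import re

def find_state(lines):
    """Return 1 if any line has a word right after '#', else 0.

    find_state is a hypothetical helper used only for illustration.
    """
    found_state = 0  # assigned up front, so it exists even with no match
    for data in lines:
        if re.search(r'(?<=#)\w+', data):
            found_state = 1
            break
    return found_state

print(find_state(["hostname(config)#aaa new-model", "shh"]))  # 1
print(find_state(["kthik", "pooooo"]))                        # 0
```

Without the initial assignment, the second call would fail with a NameError, because the body of the if is never reached.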

Since you say you need the "middle word", I take it you want to extract that word. Right now, you store the whole line when there is a match.
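As a rough comparison of what each approach gives you, the sketch below runs both patterns on the first line of the sample file. The variable names are only for illustration.

```python
import re

line = "hostname(config)#aaa new-model"

# Lookbehind: the '#' is required but is not part of the match itself.
lookbehind = re.search(r'(?<=#)\w+', line)
print(lookbehind.group())   # aaa

# Capturing group: the whole match includes '#', group 1 is just the word.
grouped = re.search(r'#(\w+)', line)
print(grouped.group(0))     # #aaa
print(grouped.group(1))     # aaa
```

Either pattern can give you the word; what matters is reading it from the match object rather than assigning the whole line.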

Here is a piece of code that should work for you (it prints "aaa is Found"):

import re

content = ["hostname(config)#aaa new-model", "fdfsfd b", "kthik", "pooooo", "shh"]  # <= TEST DATA
name = ''
found_state = 0                        # Initialise found_state before the loop
for data in content:
    m = re.search(r'#(\w+)', data)     # Use a raw string literal and a capturing group
    if m:                              # If there was a match:
        found_state = 1                #   - set the flag
        name = m.group(1)              #   - get the word after #
        break
if found_state == 1:
    print(name + " is Found")
else:
    print("NF")

However, you may want to reduce your code to:

res = []
for data in content:
    res.extend(re.findall(r'#(\w+)', data))
print(res)

The #(\w+) pattern captures one or more word characters after a #. Because the pattern contains a capturing group, re.findall returns only the captured substrings, and extend adds all of them to the list.
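If you do not need to go line by line, one possible variation is to run re.findall once over the whole text. This is only a sketch: words_after_hash is a made-up helper name, and the sample string stands in for the contents of file.conf.

```python
import re

def words_after_hash(text):
    """Return every word that directly follows a '#' in text.

    words_after_hash is a hypothetical helper used only for illustration.
    """
    return re.findall(r'#(\w+)', text)

# Stand-in for the contents of file.conf from the question.
sample = "hostname(config)#aaa new-model\nfdfsfd b\nkthik\npooooo\nshh\n"
print(words_after_hash(sample))         # ['aaa']
print(words_after_hash("a#one b#two"))  # ['one', 'two']
```

With a real file you would pass f.read() instead of the sample string, inside the same with open('file.conf') as f: block used in the question.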
