Read a file and search for a string; if it matches, return the next word in Python
Read the next word in a file in Python
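I am searching for some words in a file in Python. After finding each word, I need to read the next two words from the file. I have looked for a solution, but I could not find a way to read just the following words.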
# offsetFile - file pointer
# searchTerms - list of words
for line in offsetFile:
    for word in searchTerms:
        if word in line:
            # here get the next two terms after the word
            pass
Thank you for your time.
Update: only the first occurrence is needed. In fact, each word can occur only once in this case.
File:
accept 42 2820 access 183 3145 accid 1 4589 algebra 153 16272 algem 4 17439 algol 202 6530
Words: ['access', 'algebra']
When I encounter 'access' and 'algebra' while searching the file, I need the values 183 3145 and 153 16272 respectively.
A simple way to handle this is to read the file through a generator that yields one word at a time.
def words(fileobj):
    for line in fileobj:
        for word in line.split():
            yield word
Then find the word you are interested in and read the next two words:
with open("offsetfile.txt") as wordfile:
    wordgen = words(wordfile)
    for word in wordgen:
        if word in searchterms:   # searchterms should be a set() to make this fast
            break
    else:
        word = None               # makes sure word is None if the word wasn't found
    foundwords = [word, next(wordgen, None), next(wordgen, None)]
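Now foundwords[0] is the word you found, foundwords[1] is the word after it, and foundwords[2] is the second word after it. If there are not enough words left, one or more elements of the list will be None.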
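Since the question lists several search terms, the same generator idea can be extended to collect the two following words for every term instead of stopping at the first match. The following is only a sketch: the words generator is reproduced from the answer, while find_following, its count parameter, and the in-memory sample file are hypothetical names introduced for illustration.

```python
import io


def words(fileobj):
    """Yield the whitespace-separated words of a file, one at a time."""
    for line in fileobj:
        for word in line.split():
            yield word


def find_following(fileobj, searchterms, count=2):
    """Map each search term to the `count` words after its first occurrence.

    Hypothetical helper, not part of the original answer.
    """
    remaining = set(searchterms)  # set membership tests are fast
    found = {}
    wordgen = words(fileobj)
    for word in wordgen:
        if word in remaining:
            remaining.discard(word)
            # Pull the following words from the same generator; None pads
            # the result when the file runs out of words.
            found[word] = [next(wordgen, None) for _ in range(count)]
            if not remaining:
                break  # every term has been found
    return found


# In-memory stand-in for the file shown in the question.
sample = io.StringIO(
    "accept 42 2820 access 183 3145 accid 1 4589 "
    "algebra 153 16272 algem 4 17439 algol 202 6530\n"
)
result = find_following(sample, ["access", "algebra"])
print(result)  # {'access': ['183', '3145'], 'algebra': ['153', '16272']}
```

Because the following words are consumed from the generator, a search term that appears directly after another match would be skipped; with the word/number/number layout in the question this should not occur.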
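It is a little more complicated if you want to force the match to stay within a single line, but usually you can get away with treating the file as just a sequence of words.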
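For the line-restricted case, one possible approach is to split each line into tokens and slice from the match position, so that the following words never come from the next line. This is a sketch under that assumption; next_words_in_line and its count parameter are hypothetical names, not part of the original answer.

```python
def next_words_in_line(lines, searchterms, count=2):
    """Return (word, following words) for the first match, within one line.

    Hypothetical helper: `lines` is any iterable of strings, such as an
    open file object.
    """
    terms = set(searchterms)
    for line in lines:
        tokens = line.split()
        for i, token in enumerate(tokens):
            if token in terms:
                following = tokens[i + 1:i + 1 + count]
                # Pad with None when the line ends before `count` words.
                following += [None] * (count - len(following))
                return token, following
    return None, [None] * count


# '3145' sits on the next line, so it is not picked up.
lines = ["accept 42 2820 access 183", "3145 accid 1 4589"]
match = next_words_in_line(lines, ["access"])
print(match)  # ('access', ['183', None])
```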
If you only need to retrieve the first two words of a line, just do this:
offsetFile.readline().split()[:2]
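To make the behaviour of this one-liner concrete: it reads one line from the current file position and keeps the first two whitespace-separated tokens, regardless of any search term. A small sketch, using an in-memory io.StringIO as an assumed stand-in for offsetFile:

```python
import io

# Hypothetical in-memory stand-in for the question's offsetFile.
offsetFile = io.StringIO("accept 42 2820 access 183 3145\n")

# readline() returns the next line; split() breaks it on whitespace;
# [:2] keeps at most the first two tokens.
first_two = offsetFile.readline().split()[:2]
print(first_two)  # ['accept', '42']
```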
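word = '3'        # the word to search for
delim = ','       # the delimiter between words
two_words = None  # stays None if the word is not found
with open('test_file.txt') as f:
    for line in f:
        s_line = line.strip().split(delim)
        # Test against the split tokens, not the raw line, so that
        # index() cannot fail on a substring-only match such as '13'.
        if word in s_line:
            i = s_line.index(word)
            two_words = tuple(s_line[i + 1:i + 3])
            break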
def searchTerm(offsetFile, searchTerms):
    # remove any found words from this list; if empty we can exit
    searchThese = searchTerms[:]
    for line in offsetFile:
        words_in_line = line.split()
        # Use this list comprehension if a word is always followed by two numbers.
        # Else use words_in_line.
        for word in [w for i, w in enumerate(words_in_line) if i % 3 == 0]:
            # No more words to search.
            if not searchThese:
                return
            # Search remaining words.
            if word in searchThese:
                searchThese.remove(word)
                i = words_in_line.index(word)
                print(words_in_line[i:i+3])
For 'access' and 'algebra' I get this result:
['access', '183', '3145']
['algebra', '153', '16272']