Reading the next word in a file in Python

Question

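I am searching for some words in a file in Python. After finding each word, I need to read the next two words from the file. I have looked for a solution, but I could not find a way to read just the words that follow.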

# offsetFile - file pointer
# searchTerms - list of words

for line in offsetFile:
    for word in searchTerms:
        if word in line:
            # here get the next two terms after the word
            pass  # placeholder so the loop is syntactically valid

Thanks for your time.

Update: Only the first occurrence is needed. In fact, each word can occur only once in this case.

File:

accept 42 2820 access 183 3145 accid 1 4589 algebra 153 16272 algem 4 17439 algol 202 6530

Words: ['access', 'algebra']

When I search the file and come across 'access' and 'algebra', I need the values 183 3145 and 153 16272 respectively.

Answer 1

A simple way to handle this is to read the file through a generator that yields one word at a time.

def words(fileobj):
    for line in fileobj:
        for word in line.split():
            yield word

Then find the word you are interested in and read the next two words:

with open("offsetfile.txt") as wordfile:
    wordgen = words(wordfile)
    for word in wordgen:
        if word in searchterms:   # searchterms should be a set() to make this fast
            break
    else:
        word = None               # makes sure word is None if the word wasn't found

    foundwords = [word, next(wordgen, None), next(wordgen, None)]

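Now foundwords[0] is the word you found, foundwords[1] is the word after it, and foundwords[2] is the second word after it. If there are not enough words left, one or more elements of the list will be None.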
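The question asks for the values that follow each of several search terms, not only the first match. A minimal sketch of how the generator approach might be extended to that case follows. The helper name `next_two_after` is hypothetical, and `io.StringIO` stands in for the real file so the example can run on its own.

```python
import io


def words(fileobj):
    """Yield the whitespace-separated words of a file, one at a time."""
    for line in fileobj:
        for word in line.split():
            yield word


def next_two_after(fileobj, searchterms):
    """Map each search term to the two words after its first occurrence.

    Hypothetical helper for illustration, not part of the original answer.
    """
    remaining = set(searchterms)
    found = {}
    wordgen = words(fileobj)
    for word in wordgen:
        if word in remaining:
            remaining.discard(word)
            # The two words consumed here are not checked against the terms.
            found[word] = (next(wordgen, None), next(wordgen, None))
            if not remaining:
                break  # every term has been found; stop reading
    return found


# io.StringIO stands in for the real file (sample data from the question).
sample = io.StringIO(
    "accept 42 2820 access 183 3145 accid 1 4589 "
    "algebra 153 16272 algem 4 17439 algol 202 6530\n"
)
result = next_two_after(sample, ["access", "algebra"])
print(result)  # {'access': ('183', '3145'), 'algebra': ('153', '16272')}
```

Terms that never appear are simply absent from the returned dict, and a term found near the end of the file gets None for any missing word.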

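It is a little more complicated if you want to force the match to stay within a single line, but usually you can get away with treating the file as just a sequence of words.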
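One way the line-restricted variant might look is sketched below: each line is split on its own, so the two following words are only taken from the same line as the match. The name `next_two_in_line` is hypothetical, and `io.StringIO` stands in for the real file.

```python
import io


def next_two_in_line(fileobj, searchterms):
    """Return [word, next, next-but-one] for the first search term found.

    Only words on the same line as the match are considered; missing
    words are reported as None. Hypothetical helper for illustration.
    """
    searchterms = set(searchterms)
    for line in fileobj:
        tokens = line.split()
        for i, word in enumerate(tokens):
            if word in searchterms:
                following = tokens[i + 1:i + 3]
                # Pad with None when the line ends before two words follow.
                following += [None] * (2 - len(following))
                return [word] + following
    return [None, None, None]


# The match sits near the end of the first line, so only one word follows it.
sample = io.StringIO("accept 42 2820 access 183\n3145 accid 1 4589\n")
line_result = next_two_in_line(sample, ["access"])
print(line_result)  # ['access', '183', None]
```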

Answer 2

If you only need to retrieve the first two words of a line, do the following:

offsetFile.readline().split()[:2]
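To see what this expression returns, here is a small sketch with `io.StringIO` standing in for `offsetFile`. Splitting the sample data from the question over two lines is an assumption made only for this illustration.

```python
import io

# io.StringIO stands in for the real file object from the question.
offsetFile = io.StringIO(
    "accept 42 2820 access 183 3145 accid 1 4589\n"
    "algebra 153 16272 algem 4 17439 algol 202 6530\n"
)

first_two = offsetFile.readline().split()[:2]
print(first_two)   # ['accept', '42']

# A second call moves on to the next line of the file.
second_two = offsetFile.readline().split()[:2]
print(second_two)  # ['algebra', '153']
```

The expression returns the first two words of whichever line is read next, regardless of where a search term appeared, so it fits only when the wanted words are known to start a line of their own.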

Answer 3

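word = '3'   # the word to search for
delim = ','  # the delimiter between words

with open('test_file.txt') as f:
    for line in f:
        s_line = line.strip().split(delim)
        if word in s_line:  # match whole tokens rather than substrings
            i = s_line.index(word)
            # Slicing avoids an IndexError when fewer than two words follow.
            two_words = tuple(s_line[i + 1:i + 3])
            break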

Answer 4

def searchTerm(offsetFile, searchTerms):
    # Remove found words from this copy; once it is empty we can exit.
    searchThese = searchTerms[:]
    for line in offsetFile:
        words_in_line = line.split()
        # Step by 3 if every word is always followed by two numbers.
        # Otherwise iterate over every index of words_in_line.
        for i in range(0, len(words_in_line), 3):
            # No more words to search for.
            if not searchThese:
                return
            # Search for the remaining words.
            word = words_in_line[i]
            if word in searchThese:
                searchThese.remove(word)
                print(words_in_line[i:i+3])

For 'access' and 'algebra' I get the following result:

['access', '183', '3145']
['algebra', '153', '16272']
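If the file really is a flat sequence of word/number/number triples, as the sample suggests, another option is to group the tokens in threes and look the terms up directly. This is a sketch under that assumption, not part of the original answer; `read_triples` is a hypothetical name and `io.StringIO` stands in for the file.

```python
import io


def read_triples(fileobj):
    """Build {word: (first, second)} from a file of word/number/number triples.

    Assumes every word is followed by exactly two values; trailing tokens
    that do not complete a triple are ignored. Hypothetical helper.
    """
    tokens = iter(fileobj.read().split())
    table = {}
    # Zipping the same iterator three times yields consecutive triples.
    for word, first, second in zip(tokens, tokens, tokens):
        # Keep only the first occurrence of each word.
        table.setdefault(word, (first, second))
    return table


# io.StringIO stands in for the real file (sample data from the question).
sample = io.StringIO(
    "accept 42 2820 access 183 3145 accid 1 4589 "
    "algebra 153 16272 algem 4 17439 algol 202 6530\n"
)
table = read_triples(sample)
for term in ["access", "algebra"]:
    print([term, *table[term]])
```

This reads the whole file into memory, which is convenient when many terms are looked up but may not suit very large files.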

4 answers

Answer 1
16 Accepted 2012-04-22 01:37:27

Answer 2
2 2012-04-22 01:40:04

Answer 3
1 2012-04-22 01:47:42

Answer 4
1 2012-04-22 11:49:19
