How to extract the words before a keyword after finding it in text with Python

Question

I have a keyword, "grand master", and I am searching for it in a huge text. I need to extract the 5 words before and the 5 words after each occurrence of the keyword (depending on its position, these may spill into the previous or next sentence). The keyword appears multiple times in the text.

As a trial, I first used text.find() to locate the keyword in the text, and found it at 4 different positions:

>>> positions
[125, 567, 34445, 98885445]

So I tried splitting the text on whitespace and taking the first 5 words after the keyword:

text[positions[i]:].split()[len(keyword.split()):len(keyword.split()) + 5]

But how do I extract the 5 words before the keyword?

Answer 1

You can simply use:

text[:positions[i]].split()[-5:]
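Putting the question's slice and this answer's slice together, a minimal sketch for every occurrence might look like the following. The function name context_words and the sample string are made up for illustration; the approach is just str.find plus str.split as above. Note that str.find matches substrings, so "grand masters" would also count as a hit.

```python
def context_words(text, keyword, n=5):
    """Return a (words_before, words_after) pair for each occurrence of keyword."""
    results = []
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        before = text[:start].split()[-n:]  # last n words before the keyword
        after = text[end:].split()[:n]      # first n words after the keyword
        results.append((before, after))
        start = text.find(keyword, end)     # continue searching after this hit
    return results


sample = "a b c d e f grand master g h i j k l"
print(context_words(sample, "grand master"))
# [(['b', 'c', 'd', 'e', 'f'], ['g', 'h', 'i', 'j', 'k'])]
```

Each call to split() here walks the whole prefix or suffix, so on a very large text with many hits it may be worth slicing a bounded window around each position first.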

Answer 2

Use the re module for this. For the first keyword match:

import re

# \S+ matches a single whitespace-free word; (.+) would also swallow spaces
pattern = r"(\S+) (\S+) (\S+) (\S+) (\S+) grand master (\S+) (\S+) (\S+) (\S+) (\S+)"
match = re.search(pattern, text)
if match:
    firstword_before = match.group(1)  # first pair of parentheses
    lastword_before = match.group(5)

    firstword_after = match.group(6)
    lastword_after = match.group(10)

Each pair of parentheses in the pattern defines a numbered group. The first pair corresponds to match.group(1), the second pair to match.group(2), and so on. If you want all the groups, you can use:

match.groups() # returns tuple of groups

or

match.group(0) # returns the entire matched string
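To make the difference between the two calls concrete, here is a small sketch on a made-up sentence, using two words of context on each side to keep it short:

```python
import re

text = "one two three grand master four five six"
match = re.search(r"(\S+) (\S+) grand master (\S+) (\S+)", text)
if match:
    print(match.group(1))  # 'two', the first captured word
    print(match.groups())  # ('two', 'three', 'four', 'five')
    print(match.group(0))  # 'two three grand master four five'
```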

To get every keyword match in the text, use re.findall. See the re module documentation for details.
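As a rough sketch of the re.findall route, the pattern below allows up to n words on each side, so matches near the start or end of the text still work. The function name keyword_context_re is hypothetical, and the lookahead for the trailing words is my own choice rather than something from this answer: it keeps the following words from being consumed, so a second occurrence close behind the first is not skipped.

```python
import re


def keyword_context_re(text, keyword, n=5):
    """Return (words_before, words_after) for each match of keyword."""
    before = r"((?:\S+\s+){0,%d})" % n      # up to n words, each followed by whitespace
    after = r"(?=((?:\s+\S+){0,%d}))" % n   # up to n following words, captured but not consumed
    pattern = re.compile(before + re.escape(keyword) + after)
    # With two groups in the pattern, findall returns a list of 2-tuples
    return [(b.split(), a.split()) for b, a in pattern.findall(text)]


sample = "a b c d e f grand master g h i j k l"
print(keyword_context_re(sample, "grand master"))
# [(['b', 'c', 'd', 'e', 'f'], ['g', 'h', 'i', 'j', 'k'])]
```

One caveat: matches cannot overlap, so when two occurrences sit fewer than n words apart, the "before" list of the later one only reaches back to the end of the earlier keyword.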

PS: There are better ways to write the pattern. That's just me being lazy.

Answer 1
Votes: 1, accepted, 2018-10-09 14:50:47

Answer 2
Votes: 0, 2018-10-09 15:15:30
