Find the 5 words upstream of a matched word in Python

Question

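I want to find the 5 words upstream of a matched word in a string. For example, I have the string

This is the most Absurd rat in the history

I want to search for "rat" and then get the 4 words upstream of the match.

I tried using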

re.search(r'\brat\b', " This is the most Absurd rat in the history")

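but that only gives me a character position such as span=(25, 28). How would I use it to get the words? If I knew the position of the word within the sentence, I could simply take the 4 words at the indices before or after it.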

Answer 1

(?:\S+\s){4}(?=rat\b) may be close to what you want:

>>> sentence = "This is the most Absurd rat in the history"
>>> import re
>>> re.findall(r'(?:\S+\s){4}(?=rat\b)', sentence, re.I)
['is the most Absurd ']
>>> re.findall(r'(?:\S+\s){4}(?=rat\b)', "I like Bratwurst", re.I)
[]
>>> re.findall(r'(?:\S+\s){4}(?=rat\b)', "A B C D rat D E F G H rat", re.I)
['A B C D ', 'E F G H ']

The session above is one example. Note that each match keeps its trailing space.
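For reuse, the same lookahead idea can be wrapped in a small helper. This is only a sketch: the function name words_before and its parameters are made up for illustration, and it assumes that words are separated by whitespace.

```python
import re

def words_before(text, target, n=4):
    """For each standalone `target` preceded by whitespace, return the n words before it."""
    # re.escape guards against regex metacharacters in the target word.
    pattern = r'(?:\S+\s+){%d}(?=%s\b)' % (n, re.escape(target))
    return [chunk.split() for chunk in re.findall(pattern, text, flags=re.I)]

print(words_before("This is the most Absurd rat in the history", "rat"))
# [['is', 'the', 'most', 'Absurd']]
print(words_before("I would really like some Bratwurst", "rat"))
# []
```

Because findall returns non-overlapping matches, two occurrences that are fewer than n words apart will not both be reported, and an occurrence with fewer than n words in front of it is skipped.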

Answer 2

You can use re.findall:

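import re
s = "This is the most Absurd rat ever in the history"
# Raw string avoids invalid-escape warnings; \b keeps words such as "rational" from matching.
print(re.findall(r'^[\w\W]+(?=\srat\b)', s)[0].split()[-4:])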

Output:

['is', 'the', 'most', 'Absurd']

Edit 2:

If you are looking for the four words preceding each occurrence of "rat", you can use itertools.groupby:

import itertools
s = "Some words go here rat This is the most Absurd rat final case rat"
new_data = [[a, list(b)] for a, b in itertools.groupby(s.split(), key=lambda x:x.lower() == 'rat')]
if any(a for a, _ in new_data): #to ensure that "rat" does exist in the string
  results = [new_data[i][-1][-4:] for i in range(len(new_data)-1) if new_data[i+1][0]]
  print(results)

Output:

[['Some', 'words', 'go', 'here'], ['is', 'the', 'most', 'Absurd'], ['final', 'case']]
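The "position of the word" idea from the question can also be sketched without regular expressions or groupby, by working on word indices directly. The helper name preceding_words is hypothetical, and the sketch assumes whitespace-separated words with no attached punctuation (so "rat." would not count as a match).

```python
def preceding_words(text, target, n=4):
    """Return the up-to-n words before each occurrence of `target` (case-insensitive)."""
    words = text.split()
    wanted = target.lower()
    # max(0, i - n) keeps the slice from wrapping around near the start of the text.
    return [words[max(0, i - n):i] for i, w in enumerate(words) if w.lower() == wanted]

s = "Some words go here rat This is the most Absurd rat final case rat"
print(preceding_words(s, "rat"))
# [['Some', 'words', 'go', 'here'], ['is', 'the', 'most', 'Absurd'], ['Absurd', 'rat', 'final', 'case']]
```

Unlike the groupby version, this window does not stop at an earlier "rat", which is why the last entry differs. Which behaviour is preferable depends on what "upstream" should mean when matches are close together.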

Answer 3

Edit: since you want the words that appear before every occurrence of rat, you need findall with a more complex regular expression:

import re
s = 'This is the most absurd rat ever in the history of rat kind I tell you this rat is ridiculous.'
# \s+ and \b keep the match on whole words, so "rat" inside a longer word is ignored
answer = [sub.split() for sub in re.findall(r'((?:\S+\s+){4})\brat\b', s)]
# [['is', 'the', 'most', 'absurd'],
#  ['in', 'the', 'history', 'of'],
#  ['I', 'tell', 'you', 'this']]
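Patterns of this shape are sensitive to two details: whether the whitespace between words is optional, and whether rat is anchored with \b. A small sketch comparing the two variants (the names loose and strict are just labels for this illustration):

```python
import re

loose = r'((?:\S+\s*){4})rat'        # optional whitespace, no word boundaries
strict = r'((?:\S+\s+){4})\brat\b'   # whole words only

text = "I like Bratwurst"
print(re.findall(loose, text))   # ['I like B']
print(re.findall(strict, text))  # []
```

With optional whitespace the engine is free to split a single word into several "words" to reach the required count of four, and without \b it accepts rat inside Bratwurst. Both variants return the same three groups for the sample sentence in this answer.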

Previous answer:

You can split the string on rat:

import re
s = 'This is the most Absurd rat ever in the history'
answer = re.split(r'\brat\b', s, maxsplit=1)[0].split()[-4:]
# => ['is', 'the', 'most', 'Absurd']

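I am assuming that "upstream" means before; if you mean after, change [0] to [1] and [-4:] to [:4]. You will also need to add some code to check that rat actually occurs in the string. Otherwise the [1] variant raises an IndexError and the [0] variant silently returns the last four words of the whole string.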
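One way to add that check, and to make use of the span that re.search reports in the question, is sketched below. The helper name words_before_first is hypothetical, and the sketch assumes n is at least 1.

```python
import re

def words_before_first(text, target, n=4):
    """Return the up-to-n words before the first standalone `target`, or None if absent."""
    match = re.search(r'\b%s\b' % re.escape(target), text)
    if match is None:
        return None
    # match.start() is the first number of the span, e.g. 25 for span=(25, 28).
    return text[:match.start()].split()[-n:]

print(words_before_first(" This is the most Absurd rat in the history", "rat"))
# ['is', 'the', 'most', 'Absurd']
print(words_before_first("no rodents here", "rat"))
# None
```

Returning None for a missing word keeps "not found" distinguishable from "found at the very start", which yields an empty list.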

3 answers

Answer 1: score 2, 2018-09-06 19:27:50
Answer 2: score 1, accepted, 2018-09-06 19:10:08
Answer 3: score 1, 2018-09-06 19:16:51
