
Find the sentence's index (sentences in a list) of a specific word in Python

I currently have a file that contains a list that looks like this:

example = ['Mary had a little lamb' , 
       'Jack went up the hill' , 
       'Jill followed suit' ,    
       'i woke up suddenly' ,
       'it was a really bad dream...']

I would like to find the index of the sentence that contains a given word, for example "woke". In this example the answer should be f("woke") = 3, where f is a function.

I tried tokenizing each sentence first, so that I could find the index of the word, like this:

>>> from nltk.tokenize import word_tokenize
>>> example = ['Mary had a little lamb' , 
...            'Jack went up the hill' , 
...            'Jill followed suit' ,    
...            'i woke up suddenly' ,
...            'it was a really bad dream...']
>>> tokenized_sents = [word_tokenize(i) for i in example]
>>> for i in tokenized_sents:
...     print(i)
... 
['Mary', 'had', 'a', 'little', 'lamb']
['Jack', 'went', 'up', 'the', 'hill']
['Jill', 'followed', 'suit']
['i', 'woke', 'up', 'suddenly']
['it', 'was', 'a', 'really', 'bad', 'dream', '...']

But I don't know how to get the index of the word from here, or how to link it to the sentence's index. Does someone know how to do that?

You can iterate over each string in the list, split it on whitespace, then check whether your search word is in the resulting list of words. If you do this in a list comprehension, you get back a list of the indices of the strings that satisfy this requirement.

def f(l, s):
    return [index for index, value in enumerate(l) if s in value.split()]

>>> f(example, 'woke')
[3]
>>> f(example, 'foobar')
[]
>>> f(example, 'a')
[0, 4]
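Note that `value.split()` only splits on whitespace, so punctuation stays attached to the words: in the example list, 'dream...' would not match a search for 'dream'. Below is a minimal sketch of one way around this without nltk, assuming case-insensitive matching is acceptable. The name `f_words` and the `\w+` pattern are illustrative assumptions, not part of the answer above.

```python
import re

example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']

def f_words(sentences, word):
    # Hypothetical helper: extract runs of word characters so that
    # trailing punctuation such as '...' is not part of any token.
    word = word.lower()
    return [index for index, sentence in enumerate(sentences)
            if word in re.findall(r"\w+", sentence.lower())]

print(f_words(example, 'dream'))  # [4]
print(f_words(example, 'mary'))   # [0]
```

With plain `split()`, both of these searches would return an empty list, the first because of the trailing dots and the second because of the capital letter.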

If you prefer using the nltk library:

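from nltk.tokenize import word_tokenize

def f(l, s):
    return [index for index, value in enumerate(l) if s in word_tokenize(value)]

Alternatively, loop over the already tokenized sentences and return the index of the first match:

def first_index(tokenized_sents, word):
    # wrapped in a function so that `return` is valid
    for index, sentence in enumerate(tokenized_sents):
        if word in sentence:
            return index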

For all the sentences:

indices = [index for index, sentence in enumerate(tokenized_sents) if 'woke' in sentence]
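The question also asks how to link the word's position to the sentence's index. Here is a minimal sketch, assuming the sentences are already tokenized as in the question. The tokenizing is done with `str.split()` here so the example runs without nltk, and the name `find_positions` is hypothetical.

```python
example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']

# Stand-in for the nltk tokenization used in the question.
tokenized_sents = [sentence.split() for sentence in example]

def find_positions(tokenized_sents, word):
    # Return a (sentence_index, word_index) pair for every occurrence.
    return [(s_idx, w_idx)
            for s_idx, tokens in enumerate(tokenized_sents)
            for w_idx, token in enumerate(tokens)
            if token == word]

print(find_positions(tokenized_sents, 'woke'))  # [(3, 1)]
print(find_positions(tokenized_sents, 'a'))     # [(0, 2), (4, 2)]
```

Each pair gives the sentence's index first and the word's position within that sentence second, so a word that occurs in several sentences yields several pairs.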

If the requirement is to return the first sentence in which the word occurs, you can use something like:

def func(strs, word):
    for idx, s in enumerate(strs):
        if s.find(word) != -1:
            return idx
example = ['Mary had a little lamb' , 
       'Jack went up the hill' , 
       'Jill followed suit' ,    
       'i woke up suddenly' ,
       'it was a really bad dream...']
func(example,"woke")
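One caveat: `s.find(word)` matches substrings, so a search for 'it' already succeeds on the first sentence because of 'little'. If whole-word matching is wanted, a sketch along the following lines may help. It assumes whitespace splitting is good enough, and `first_sentence_index` is a hypothetical name.

```python
example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']

def func(strs, word):
    # Substring version from the answer above, repeated for comparison.
    for idx, s in enumerate(strs):
        if s.find(word) != -1:
            return idx

def first_sentence_index(strs, word):
    # Whole-word variant: next() returns the first matching index,
    # or None when no sentence contains the word.
    return next((idx for idx, s in enumerate(strs) if word in s.split()), None)

print(func(example, 'it'))                      # 0 (matches inside 'little')
print(first_sentence_index(example, 'it'))      # 4
print(first_sentence_index(example, 'woke'))    # 3
print(first_sentence_index(example, 'foobar'))  # None
```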
