Loop through a list of string items and return the items that contain a substring in Python

Question

I am trying to loop through a list of sentences and pull out only the items that contain a substring (keyword). When I use return instead of yield in my function, I get back a list of characters; with yield I get full sentences, but I know that makes it a generator, and I want a full list of every sentence that contains the word. Is .find() causing the issue, or is there a better way to pull matching items from a list of strings?

import nltk
from nltk import *
import pandas as pd
f= open("filename.txt").read()
sent_list = sent_tokenize(f)

hunt = "youth" #keyword i'm searching for
def hunter(sent):
    for term in sent:
        if term.find(hunt) is not -1:
            yield term

complete_lst = [term for term in hunter(sent_list)]
df = pd.DataFrame({'key_term_sentences':complete_lst})

Answer 1

There are a couple of bugs in your code, one of which is not using split. After fixing that, everything works fine. Below is a working example:

In [31]: sent_list = ['this is first sentence for demo purposes', 
                      'this is second sentence containing youth and youthful', 
                      'this is 3rd sentence which is dummy one btw']

In [32]: hunt = 'youth'

# note that we need two `for` loops since the function takes list of sentences
In [33]: def hunter(sent_list):
    ...:     for sent in sent_list:
    ...:         for term in sent.split():
    ...:             if hunt in term:
    ...:                 yield term
    ...:                 

In [34]: list(hunter(sent_list))
Out[34]: ['youth', 'youthful']

Just to demonstrate that you can also use term.find(hunt), as you're already doing (compare the result with != rather than is not, since is tests object identity, not equality):

In [35]: def hunter(sent_list):
    ...:     for sent in sent_list:
    ...:         for term in sent.split():
    ...:             if term.find(hunt) != -1:
    ...:                 yield term
    ...:                 

In [36]: list(hunter(sent_list))
Out[36]: ['youth', 'youthful']
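
The examples above yield the matching words. If the goal is the full sentences, as the question asks, a minimal sketch could look like the following. The function names first_match and all_matches are made up for illustration and are not from the original code. The first function shows why return produced a list of characters: it exits on the first hit and hands back a single string, which the caller then iterates character by character.

```python
def first_match(sentences, keyword):
    # `return` leaves the function at the first hit, so the caller
    # receives one string rather than a collection of sentences.
    for sent in sentences:
        if keyword in sent:
            return sent


def all_matches(sentences, keyword):
    # A list comprehension keeps every sentence containing the keyword.
    return [sent for sent in sentences if keyword in sent]


sent_list = [
    'this is first sentence for demo purposes',
    'this is second sentence containing youth and youthful',
    'this is 3rd sentence which is dummy one btw',
]

# Iterating over the single returned string gives individual characters.
chars = list(first_match(sent_list, 'youth'))

# The comprehension gives whole sentences.
matches = all_matches(sent_list, 'youth')
print(matches)
```

The resulting list can be passed to pd.DataFrame in the same way as complete_lst in the question.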

Answer 2

An even easier way would be to .split the text into a list of individual sentences. From there you can iterate through each one, split it into words, and check whether the word is in the sentence.

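hunt = "youth"
def hunter(sent):
    # split the text into sentences on periods
    sentences = sent.split('.')
    for each in sentences:
        # split the sentence into words
        check = each.split(' ')
        for word in check:
            if word == hunt:  # == compares; a single = is a syntax error here
                print(each)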
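
If a list is wanted instead of printed output, the same idea can be sketched as below. This is only an illustration under assumptions: the name hunt_sentences and the sample text are hypothetical, and splitting on '.' is a rough stand-in for a real sentence tokenizer. Like the code above, it matches whole words only, so "youthful" does not count as a match for "youth".

```python
def hunt_sentences(text, hunt):
    """Return the sentences of `text` that contain `hunt` as a whole word."""
    matches = []
    for sentence in text.split('.'):
        # split() with no argument splits on any run of whitespace
        if hunt in sentence.split():
            matches.append(sentence.strip())
    return matches


# Hypothetical sample text, made up for this sketch
text = "The youth spoke up. Nobody listened. A youthful crowd gathered."
result = hunt_sentences(text, "youth")
print(result)
```

Checking membership with `hunt in sentence.split()` also avoids adding the same sentence twice when the keyword appears in it more than once.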
