How to search for keywords in the sentences of a key-value list and get the matched sentences with their references?

I would like to search for keywords in sentences whose data is stored in a key-value list, and return the matched sentences together with their references. I have been working on checkSentence(). I know how to write it to get the result for the quotes only:

def checkSentence(quote_list, searchItems):
    result_sentence = [all([searchingWord in searchingSentence for searchingWord in searchItems]) for searchingSentence in quote_list]
    return [quote_list[i] for i in range(0, len(result_sentence)) if result_sentence[i]]

checkResult = checkSentence(quote_list, searchItems)
quoteResult_list = []
for quote in checkResult:
    quoteResult_list.append(quote)

print(len(quoteResult_list))
print(quoteResult_list)

Now I would like it to return the sentences (the "content" in the data, say) together with their references (the "article"). The result would be something like ["The world is made of sweet.":"The World"].

There should be two for-loops: the first layer searches the sentences, and the second for-loop gets the "article" of each matched sentence. I have no idea why it doesn't work. It looks like the error is at item_list["quote"] and item_list["article"]? Many thanks!

The code is as below:

import json
import os

#    data part
data = {
    "title": "Vulnerable",
    "items": [
        {
            "article": "The World",
            "content": [
                "The world is made of sweet.",
                "The sweet tastes so good.",
            ]
        },
        {
            "article": "The Disaster",
            "content": [
                "That is the sweet wrapping with poison.",
                "Is that true? Are you kidding?",
            ]
        },
        {
            "article": "The Truth",
            "content": [
                "Trust me. That is not sweet!",
                "You see? That is why!",
            ]
        }
    ]
}

#    keywords for searching
searchItems = ["sweet", "is"]

#    deal with data to list
item_list = []
quote_list = []
article_list = []

for item in data["items"]:
    article = item["article"]
    for quote in item["content"]:
        item_list.append({article, quote})
        quote_list.append(quote)
        article_list.append(article)


#    check if sentences include keywords
def checkSentence(item_list, searchItems):
    for sentence in item_list["quote"]:
        result_sentence = [all([searchingWord in searchingSentence for searchingWord in searchItems]) for searchingSentence in sentence]
        sententceResult = [item_list[i] for i in range(0, len(result_sentence)) if result_sentence[i]] 
        for article in item_list["article"]:
            return_article = [all([searchingWord in searchingSentence for searchingWord in searchItems]) for searchingSentence in article]
            quoteResult = [item_list[i] for i in range(0, len(return_article)) if return_article[i]]
    return sententceResult, quoteResult

#    make the searching result as list item
checkResult = checkSentence(item_list, searchItems)
quoteResult_list = []
for quote in checkResult:
    quoteResult_list.append(quote)
print(quoteResult_list)

When adding the articles and quotes to item_list, use tuples, as this makes finding the matches easier:

item_list.append((article, quote))
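
To illustrate why the tuple helps, here is a minimal sketch comparing it with the `{article, quote}` set from the question. The variable names `as_set`, `as_tuple` and `raised` are only for illustration and are not part of the original code. It also shows the likely cause of the error at item_list["quote"]: a list is indexed by integer position, not by a string key.

```python
article = "The World"
quote = "The world is made of sweet."

as_set = {article, quote}    # what the question builds: an unordered set
as_tuple = (article, quote)  # what is suggested here: an ordered pair

# A tuple keeps positions, so it can be unpacked reliably.
first, second = as_tuple
print(first, "|", second)

# A set has no guaranteed order, so you cannot tell which element is the article.
print(type(as_set).__name__, len(as_set))

# A list only accepts integer (or slice) indices, so a string key fails.
item_list = [as_tuple]
raised = False
try:
    item_list["quote"]
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```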

Now to the checkSentence function:

Because we're now using tuples, we can save both the article and the corresponding sentence together. Then you only have to search for the keywords in sentence and, if they all match, add both article and sentence to the matches list. Afterwards you just return the list with the results.

Here's the final code (without the data):

#    keywords for searching
searchItems = ["sweet", "is"]

#    deal with data to list
item_list = []
quote_list = []
article_list = []

for item in data["items"]:
    article = item["article"]
    for quote in item["content"]:
        # use tuples to make later use easier
        item_list.append((article, quote))
        quote_list.append(quote)
        article_list.append(article)

#    check if sentences include keywords
def checkSentence(item_list, searchItems):
    matches = []
    # unpack the tuples and iterate over the list
    for article, sentence in item_list:
        # check for matching key words in sentence
        if all([searchingWord in sentence for searchingWord in searchItems]):
            # add both article and sentence to the matches, if key words are present
            matches.append((article, sentence))
    return matches

#    make the searching result as list item
checkResult = checkSentence(item_list, searchItems)
print(checkResult)
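
If you would rather have the sentence-to-article mapping mentioned in the question, one possible variation is to build a dict straight from the data. This is only a sketch: `match_sentences` is a hypothetical helper name, and the data below is a shortened copy of the question's data (without the "title" key) so that the example runs on its own.

```python
data = {
    "items": [
        {"article": "The World",
         "content": ["The world is made of sweet.", "The sweet tastes so good."]},
        {"article": "The Disaster",
         "content": ["That is the sweet wrapping with poison.", "Is that true? Are you kidding?"]},
        {"article": "The Truth",
         "content": ["Trust me. That is not sweet!", "You see? That is why!"]},
    ]
}
searchItems = ["sweet", "is"]


def match_sentences(data, keywords):
    """Hypothetical helper: map each matching sentence to its article."""
    return {
        sentence: item["article"]
        for item in data["items"]
        for sentence in item["content"]
        # keep the sentence only if every keyword occurs in it as a substring
        if all(word in sentence for word in keywords)
    }


result = match_sentences(data, searchItems)
for sentence, article in result.items():
    print(f"{sentence!r}: {article!r}")
```

Keep in mind that `in` does a case-sensitive substring test, so "is" also matches inside words such as "poison", and that a dict keeps only one article per sentence if the same sentence appears twice. If either of those matters for your data, the list of tuples above is probably the safer choice.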
