
Multiple words in a single keyword and counting them in the data in Python

I'm trying to run the following Python code to count keywords in specific values of my dictionary. With keywords = ['is', 'my'] it works fine, but with keywords = ['is', 'my name'] the keyword my name is not counted. I don't know what mistake I'm making. Could anyone look at the code and help me out? Thank you.

from collections import Counter
import json 
from typing import List, Dict


keywords = ['is', 'my name']

def get_keyword_counts(text: str, keywords: List[str]) -> Dict[str, int]:
    return {
        word: count for word, count in Counter(text.split()).items()
        if word in set(keywords)
    }

data = {
    "policy": {
        "1": {
            "ID": "ML_0",
            "URL": "www.a.com",
            "Text": "my name is Martin and here is my code"
        },
        "2": {
            "ID": "ML_1",
            "URL": "www.b.com",
            "Text": "my name is Mikal and here is my code"
        }
    }
}

for policy in data['policy'].values():
    policy.update(get_keyword_counts(policy['Text'], keywords))
print(json.dumps(data, indent=4))

get_keyword_counts splits the text on whitespace, so the phrase "my name" never appears as a single value; it is broken into the separate tokens "my" and "name". I guess you want to count it as a whole, so here is what you need:

def get_keyword_counts(text: str, keywords: List[str]) -> Dict[str, int]:
    return {
        word: text.count(word) for word in keywords
    }
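
To make the difference concrete, here is a small sketch that runs both approaches on the first sample text from the question. The variable names text and tokens are only illustrative and are not part of the original code.

```python
from collections import Counter

text = "my name is Martin and here is my code"

# Splitting on whitespace yields single-word tokens only,
# so a two-word phrase can never be one of the keys.
tokens = Counter(text.split())
print(tokens["my"])       # 2
print(tokens["my name"])  # 0 (Counter returns 0 for a missing key)

# str.count searches the whole text for the phrase as a substring.
print(text.count("my name"))  # 1
print(text.count("is"))       # 2
```

Note that str.count counts non-overlapping substring matches, so a short keyword such as "is" would also be counted inside a longer word like "this".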

You are using text.split(), which ends up separating "my" and "name". Use count() instead and that should do it.
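
If keywords should only match as whole words (an assumption; the question does not say so), a regular expression with word boundaries is one possible variation on count(). The function name count_whole_keywords below is hypothetical and is only a sketch of that idea, not the method from the answers above.

```python
import re
from typing import Dict, List


def count_whole_keywords(text: str, keywords: List[str]) -> Dict[str, int]:
    """Count each keyword or phrase only where it appears as whole words."""
    counts = {}
    for keyword in keywords:
        # \b anchors the match at word boundaries; re.escape keeps any
        # regex metacharacters in the keyword from being interpreted.
        pattern = r"\b" + re.escape(keyword) + r"\b"
        counts[keyword] = len(re.findall(pattern, text))
    return counts


sample = "this is my name"
print(count_whole_keywords(sample, ["is", "my name"]))  # {'is': 1, 'my name': 1}
print(sample.count("is"))  # 2, because "this" also contains "is"
```

This relies on every keyword starting and ending with a word character; keywords that begin or end with punctuation would need a different boundary check.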

