
How can I implement a function that counts the frequency of a word in a text as a percentage?

I want a function that calculates how often a given word occurs in a text and expresses the result as a percentage. The text should be read from a file, and the function should return the word together with its percentage.

import re


words = re.findall(r"\w+", text)
frequencies = most_common(words)
percentages = [(instance, count / len(words)) for instance, count in frequencies]

for word, percentage in percentages:
    print("%s %.2f%%" % (word, percentage * 100))


NameError: name 'most_common' is not defined

I'd like to pass any word to the function and have it return the frequency of that word in the text file.

You can try something like this:

import re
from collections import Counter


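def frequency_in_text(word, text):
    # Split the text into words (runs of letters, digits and underscores)
    words = re.findall(r"\w+", text)
    total_len = len(words)

    # Map each distinct word to its share of all words, as a percentage
    frequencies = dict()
    for string, freq in Counter(words).items():
        frequencies[string] = freq / total_len * 100

    # A word that never occurs gets 0.0 instead of None
    return frequencies.get(word, 0.0)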
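
Since the question asks for the text to be read from a file, the same idea could be wrapped along these lines. This is only a sketch under stated assumptions: the name word_percentage_in_file, the UTF-8 default encoding, and the case-insensitive matching are not part of the answer above.

```python
import re
from collections import Counter


def word_percentage_in_file(word, path, encoding="utf-8"):
    """Return how often `word` occurs in the file at `path`, as a percentage.

    Hypothetical helper: matching is case-insensitive (an assumption), and
    0.0 is returned for an empty file or a word that never occurs.
    """
    with open(path, encoding=encoding) as handle:
        words = re.findall(r"\w+", handle.read().lower())

    if not words:  # avoid dividing by zero on an empty file
        return 0.0

    # Counter returns 0 for a missing key, so unknown words give 0.0
    return Counter(words)[word.lower()] / len(words) * 100
```

It would be called as word_percentage_in_file("the", "sample.txt"), where sample.txt is a placeholder path.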

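You can also use the pandas.Series.value_counts() method. With normalize=True it returns relative frequencies between 0 and 1, so multiply by 100 to get percentages: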

import re

import pandas as pd

def word_counter(text):
    words = pd.Series(re.findall(r"\w+", text))
    # normalize=True gives each word's share of all words (0 to 1)
    frequencies = words.value_counts(normalize=True)
    return frequencies
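
As for the NameError in the question: most_common is not a standalone function but a method of collections.Counter. A sketch of the question's snippet with that one change, wrapped in a function whose name (word_percentages) is made up here for illustration:

```python
import re
from collections import Counter


def word_percentages(text):
    """Return (word, percentage) pairs, most frequent word first."""
    words = re.findall(r"\w+", text)
    # most_common() is a Counter method, not a built-in function
    frequencies = Counter(words).most_common()
    return [(word, count / len(words) * 100) for word, count in frequencies]


for word, percentage in word_percentages("to be or not to be"):
    print("%s %.2f%%" % (word, percentage))
```

An empty text yields an empty list, because the division only runs once per word found.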

