
Plotting words frequency and NLTK

Original question: 2015-04-20 18:40:44; tags: python, matplotlib, nltk

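I have a file containing various words, and I want to count the frequency of each word in the document and plot it. However, my plot does not show the results. The x-axis must contain the words and the y-axis must contain the frequencies. I am using NLTK, NumPy, and Matplotlib.

Here is my code; maybe I am doing something wrong: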

def graph():
    f = open("file.txt", "r")
    inputfile = f.read()
    words = nltk.tokenize.word_tokenize(inputfile)
    count = set(words)
    dic = nltk.FreqDist(words)
    FreqDist(f).plot(50, cumulative=False)
    f.close()

Given this list of words in file.txt:

southbound
stopped
travel
lane
started
around
stopped
stopped
started

1 answer

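import nltk

def graph():
    # Read the whole file; the with-block closes it automatically.
    with open("file.txt", "r") as f:
        inputfile = f.read()
    # Tokenize the text, then count the tokens (not the file object).
    tokens = nltk.tokenize.word_tokenize(inputfile)
    fd = nltk.FreqDist(tokens)
    # Plot the 30 most frequent tokens as raw (non-cumulative) counts.
    fd.plot(30, cumulative=False)

graph()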

You can adjust the plot by changing the arguments passed to plot().
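Before tuning the plot, it can help to confirm the counts themselves. The sketch below is not part of the original answer. It assumes a plain whitespace split instead of `word_tokenize`, uses `collections.Counter` from the standard library (the class that `nltk.FreqDist` subclasses), and inlines the sample words from the question as a string. The helper name `top_words` is hypothetical.

```python
from collections import Counter

# Sample words from the question's file.txt, inlined so the sketch is self-contained.
SAMPLE = """southbound
stopped
travel
lane
started
around
stopped
stopped
started"""


def top_words(text, n=30):
    """Return the n most frequent whitespace-separated words as (word, count) pairs."""
    # Counter stands in for nltk.FreqDist here (an assumption for illustration).
    return Counter(text.split()).most_common(n)


if __name__ == "__main__":
    # Print the two most frequent words with their counts.
    for word, count in top_words(SAMPLE, n=2):
        print(word, count)
```

With NLTK installed, `nltk.FreqDist(tokens).most_common(2)` should yield the same pairs for this input, and these are the first values that `fd.plot()` draws.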

Related questions

Count frequency of list of words in corpus using NLTK

How to use NLTK to find the frequency distribution of specific words in a csv file

Python Pandas NLTK Frequency Distribution for Tokenized Words in Dataframe Column with a Groupby

How to count the frequency of words existing in a text using nltk

Python NLTK FreqDist - Listing words with a frequency greater than 1000

Words generated from Text.similar() and ContextIndex.similar_words() in NLTK sorted by frequency?

NLTK - Counting Frequency of Bigram

Count total number of words in a corpus using NLTK's Conditional Frequency Distribution in Python (newbie)

Plotting two nltk freqdists

Compute the term frequency–inverse document frequency with NLTK


Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you repost, please cite this site's URL or the original address.
