
Wordcloud Python with generate_from_frequencies

I am trying to create a wordcloud from a csv file. As an example, the csv file has the following structure:

a,1
b,2
c,4
j,20

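It has more rows, roughly 1800. The first column holds string values (names) and the second column holds their respective frequencies (integers). The file is then read and the key,value rows are stored in a dictionary (d), because we will use it later to plot the wordcloud: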

reader = csv.reader(open('namesDFtoCSV', 'r',newline='\n'))
d = {}
for k,v in reader:
    d[k] = v

Once the dictionary is full of values, I try to plot the wordcloud:

#Generating wordcloud. Relative scaling value is to adjust the importance of a frequency word.
#See documentation: https://github.com/amueller/word_cloud/blob/master/wordcloud/wordcloud.py
wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

But an error is thrown:

    Traceback (most recent call last):
    File ".........../script.py", line 19, in <module>
    wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)
    File "/usr/local/lib/python3.5/dist-packages/wordcloud/wordcloud.py", line  360, in generate_from_frequencies
    for word, freq in frequencies]
    File "/usr/local/lib/python3.5/dist-packages/wordcloud/wordcloud.py", line 360, in <listcomp>
    for word, freq in frequencies]
    TypeError: unsupported operand type(s) for /: 'str' and 'float'

Finally, the documentation says:

def generate_from_frequencies(self, frequencies, max_font_size=None):
    """Create a word_cloud from words and frequencies.
    Parameters
    ----------
    frequencies : dict from string to float
        A contains words and associated frequency.
    max_font_size : int
        Use this font-size instead of self.max_font_size
    Returns
    -------
    self

So I don't understand why it is throwing this error when I have met the requirements of the function. I hope someone can help me, thanks.

**Note**

I work with wordcloud 1.3.1

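This is because the values in your dictionary are strings, but wordcloud expects integers or floats.

When I run your code and then inspect your dictionary d, I get the following.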

In [12]: d

Out[12]: {'a': '1', 'b': '2', 'c': '4', 'j': '20'}

Note the ' ' around the numbers, which shows that these are actually strings.
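
The same observation can be reproduced without the original file. The sketch below is only an illustration: `sample` is an in-memory stand-in for the four rows shown in the question, and `value_types` is a name introduced here, not something from the original post.

```python
import csv
import io

# In-memory stand-in for the CSV file from the question (assumption: same layout).
sample = "a,1\nb,2\nc,4\nj,20\n"

d = {}
for k, v in csv.reader(io.StringIO(sample)):
    d[k] = v  # no conversion, exactly as in the question

# With its default settings, csv.reader returns every field as a str.
value_types = {type(v).__name__ for v in d.values()}
print(d)
print(value_types)
```

If `value_types` contains `str`, the dictionary is not yet in the "string to float" shape that the docstring asks for.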

A hacky way to fix this is to convert v to an int inside your for loop, for example:

d[k] = int(v)

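I call this hacky because it works for integers, but it can cause problems if your input contains floats: int('2.5'), for example, raises a ValueError.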
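
One possible way around that limitation is to convert with float() and skip rows that cannot be parsed. This is a minimal sketch rather than the only approach: `load_frequencies` is a hypothetical helper, and the sample rows (including the header row and the blank line) are invented here to exercise the edge cases.

```python
import csv
import io

def load_frequencies(lines):
    """Build a {word: frequency} dict from CSV rows of the form word,number."""
    frequencies = {}
    for row in csv.reader(lines):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        word, raw = row
        try:
            frequencies[word] = float(raw)  # accepts "4" as well as "4.5"
        except ValueError:
            continue  # skip rows whose second column is not numeric, e.g. a header
    return frequencies

# Invented sample data: one float, one header-like row, one blank line.
sample = io.StringIO("a,1\nb,2.5\nname,count\n\nj,20\n")
freqs = load_frequencies(sample)
print(freqs)
```

With a real file, the same function could be called on an open file object, for example inside `with open('namesDFtoCSV', newline='') as f:`, and the resulting dictionary passed to generate_from_frequencies.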

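Also, Python errors can be hard to read. Your error above can be interpreted as

script.py", line 19

TypeError: unsupported operand type(s) for /: 'str' and 'float'

"There is a type error on or before line 19 of my file. Let me look at my data types and see whether there is a mismatch between a string and a float..."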
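
The message itself can be reproduced in isolation, which makes it easier to recognise next time. The sketch below does not reproduce the library's code; it only divides a string by a float, as the traceback says happened. The names `word_freq` and `max_freq` are made up for the illustration.

```python
word_freq = '1'   # a frequency read from the CSV, still a string
max_freq = 20.0   # hypothetical float divisor, for illustration only

try:
    word_freq / max_freq
except TypeError as exc:
    message = str(exc)
    print(message)

# After converting the string, the same division works.
scaled = int(word_freq) / max_freq
print(scaled)
```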

The code below works for me:

import csv
from wordcloud import WordCloud
import matplotlib.pyplot as plt

reader = csv.reader(open('namesDFtoCSV', 'r',newline='\n'))
d = {}
for k,v in reader:
    d[k] = int(v)

#Generating wordcloud. Relative scaling value is to adjust the importance of a frequency word.
#See documentation: https://github.com/amueller/word_cloud/blob/master/wordcloud/wordcloud.py
wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)

plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

# LEARNER CODE START HERE
# Fragment of a function body: file_contents, uninteresting_words and the
# imported wordcloud module are defined outside this snippet.
# Keep only letters and whitespace, then split into words.
file_c = ""
for char in file_contents:
    if char.isalpha() or char.isspace():
        file_c += char
file_c = file_c.split()
# Drop the uninteresting words.
file_w = []
for word in file_c:
    if word.lower() not in uninteresting_words and word.isalpha():
        file_w.append(word)
# Count occurrences, case-insensitively.
frequency = {}
for word in file_w:
    if word.lower() not in frequency:
        frequency[word.lower()] = 1
    else:
        frequency[word.lower()] += 1
# wordcloud
cloud = wordcloud.WordCloud()
cloud.generate_from_frequencies(frequency)
return cloud.to_array()
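
The counting steps in the fragment above can also be written with collections.Counter. This is a sketch of an alternative, not the answerer's code: `count_words` is a hypothetical helper, and the sample sentence and stop words are invented. It stops at the frequency dictionary, so it runs without the wordcloud package.

```python
from collections import Counter

def count_words(text, uninteresting_words=()):
    """Return {lowercased word: count}, ignoring the words in uninteresting_words."""
    skip = {w.lower() for w in uninteresting_words}
    # Keep only letters and whitespace, as the fragment above does.
    cleaned = "".join(ch for ch in text if ch.isalpha() or ch.isspace())
    words = cleaned.lower().split()
    return dict(Counter(w for w in words if w not in skip))

# Invented sample input for illustration.
frequency = count_words("A cat, a hat; the cat's hat!", ["a", "the"])
print(frequency)
```

Because the values are already integers, a dictionary built this way could be passed to generate_from_frequencies without further conversion.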
