Array of tuples required by the generate_from_frequencies method in Python wordcloud

Question

I am trying to make a word cloud in Python based on the significance of strings and their corresponding data values in an Excel document. The generate_from_frequencies method takes a frequencies parameter, which the docs say should be an array of tuples.

Partial code from the wordcloud source code:

def generate_from_frequencies(self, frequencies):
    """Create a word_cloud from words and frequencies.
    Parameters
    ----------
    frequencies : array of tuples
        A tuple contains the word and its frequency.
    Returns
    -------
    self
    """
    # make sure frequencies are sorted and normalized
    frequencies = sorted(frequencies, key=item1, reverse=True)
    frequencies = frequencies[:self.max_words]
    # largest entry will be 1
    max_frequency = float(frequencies[0][1])

    frequencies = [(word, freq / max_frequency) for word, freq in frequencies]
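
The sorting and scaling in the excerpt above can be tried in isolation. This is a minimal sketch, not the library's code: it assumes that item1 is equivalent to operator.itemgetter(1), and the normalize function and its max_words default are hypothetical names introduced for illustration.

```python
from operator import itemgetter

# Assumption: item1 in the excerpt behaves like itemgetter(1),
# i.e. it picks the frequency out of each (word, frequency) pair.
item1 = itemgetter(1)


def normalize(frequencies, max_words=200):
    """Sort pairs by frequency, keep the top max_words, scale the largest to 1."""
    ordered = sorted(frequencies, key=item1, reverse=True)[:max_words]
    max_frequency = float(ordered[0][1])
    return [(word, freq / max_frequency) for word, freq in ordered]


pairs = [("hi", 6), ("seven", 17)]
normalized = normalize(pairs)
print(normalized)  # "seven" comes first with weight 1.0
```

Every element has to be indexable at position 1, which is why the input must consist of (word, frequency) pairs throughout.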

I tried using a regular list, then an ndarray from numpy, but PyCharm warns that the parameter type should be array.py, which, as I read it, is only supposed to hold characters, integers, and floating point numbers (array.py docs):

This module defines an object type which can compactly represent an array of basic values: characters, integers, floating point numbers.

My test code:

import os
import numpy
import wordcloud

d = os.path.dirname(__file__)
cloud = wordcloud.WordCloud()
array = numpy.array([("hi", 6), ("seven"), 17])
cloud.generate_from_frequencies(array)  # <= what should go in the parentheses

If I run the code above despite the PyCharm warning, I get the following error, which I suppose is another way of telling me that it can't accept the ndarray type:

  File "C:/Users/Caitlin/Documents/BioDataSorter/tag_cloud_test.py", line 8, in <module>
cloud.generate_from_frequencies(array)  # <= what should go in the parentheses
  File "C:\Python34\lib\site-packages\wordcloud\wordcloud.py", line 263, in generate_from_frequencies
frequencies = sorted(frequencies, key=item1, reverse=True)
TypeError: 'int' object is not subscriptable
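
The traceback points at the sort step rather than at the container type. The sketch below is an illustration under assumptions: it uses a plain list instead of the ndarray from the test code (numpy's handling of ragged input varies between versions) and assumes the sort key behaves like operator.itemgetter(1). It shows that the bare 17 element is enough to raise this kind of TypeError, and that a list of complete pairs sorts cleanly.

```python
from operator import itemgetter

# The same three elements as in the test code, as a plain list.
# ("seven") is just the string "seven": parentheses alone do not make a tuple.
broken = [("hi", 6), ("seven"), 17]

try:
    sorted(broken, key=itemgetter(1), reverse=True)
    error_message = None
except TypeError as exc:
    # The key function tries 17[1], and an int cannot be indexed.
    error_message = str(exc)

print(error_message)

# Every element is a (word, frequency) pair, so index 1 always exists.
fixed = [("hi", 6), ("seven", 17)]
ordered = sorted(fixed, key=itemgetter(1), reverse=True)
print(ordered)
```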

Another potential problem could be that wordcloud was written in Python 2 but I am using Python 3.4, which may have rendered some of the code unusable. What type should I pass to this method?

Answer 1

From your test code: ... # <= what should go in the parentheses

I believe you should have a tuple of (word, frequency) pairs: (("hi", float(6/(6+17))), ("seven", float(17/(6+17))))
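
The relative frequencies in that tuple can also be computed rather than typed by hand. This is a small sketch assuming Python 3, where / is true division; the counts variable is a hypothetical stand-in for the real data.

```python
# Hypothetical raw counts, matching the numbers used in the answer.
counts = (("hi", 6), ("seven", 17))

# Divide each count by the total so the frequencies sum to 1.
total = sum(count for _, count in counts)
frequencies = tuple((word, count / total) for word, count in counts)

print(frequencies)
```

Since the source excerpt in the question rescales by the largest frequency anyway, pre-normalizing like this looks optional; raw counts in the same pair structure should lead to the same relative sizes.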

Answer 2

Thanks to J Herron and selva for the suggestion to use tuples instead of a list object. I ended up with this:

cloud.generate_from_frequencies((("hi", 3),("seven", 7)))

It still showed up as an error in my IDE, which was misleading, but it worked the way it was supposed to.

Answer 3

Building on CCCodes' answer, here is the newer form of the same call, with each word mapped to its weight in a dict:

cloud.generate_from_frequencies({"hi": 3,"seven": 7})
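
For data coming from a spreadsheet, the dict can be built from (word, value) rows. This is a sketch only: the rows list is hypothetical stand-in data, and summing the values of repeated words is an assumption about how duplicates should be handled.

```python
# Hypothetical (word, value) rows, standing in for data read from a spreadsheet.
rows = [("hi", 3), ("seven", 7), ("hi", 2)]

frequencies = {}
for word, value in rows:
    # Assumption: values for a repeated word are summed so each word appears once.
    frequencies[word] = frequencies.get(word, 0) + value

print(frequencies)
```

With the wordcloud package installed, the resulting dict could then be passed in the same way as the literal above: cloud.generate_from_frequencies(frequencies).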

Answer 1
0 2016-08-24 11:02:47

Answer 2
0 accepted 2016-08-26 18:05:01

Answer 3
0 2022-04-03 14:35:50