Broadcasting error when reclassifying numpy float array using vectorized function

I want to check, for each value in a 2D numpy float array, whether it falls within the min and max boundaries of a certain numerical class. Next, I want to reassign that value to the 'score' associated with that class.

E.g. the class boundaries could be:

>>> class1 = (0, 1.5)
>>> class2 = (1.5, 2.5)
>>> class3 = (2.5, 3.5)

The class scores are:

>>> score1 = 0.75
>>> score2 = 0.50
>>> score3 = 0.25

Values outside all of the classes should default to e.g. 99.

I've tried the following, but run into a ValueError due to broadcasting.

>>> import numpy as np

>>> arr_f = (6-0)*np.random.random_sample((4,4)) + 0  # array of random floats


>>> def reclasser(x, classes, news):
>>>     compare = [x >= min and x < max for (min, max) in classes]
>>>     try:
>>>         return news[compare.index(True)
>>>     except Value Error:
>>>         return 99.0


>>> v_func = np.vectorize(reclasser)
>>> out = v_func(arr_f, [class1, class2, class3], [score1, score2, score3])

ValueError: operands could not be broadcast together with shapes (4,4) (4,2) (4,) 

Any suggestions on why this error occurs and how to remedy it would be most appreciated. Also, if I'm entirely on the wrong path using vectorized functions, I'd be happy to hear that too.

Try to first make the code work without using np.vectorize. The code above won't work even with a single float as first argument. You misspelled ValueError; also it's not a good idea to use min and max as variable names (they are Python functions). A fixed version of reclasser would be:

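def reclasser(x, classes, news):
    # lower bound inclusive, upper bound exclusive, as in the question
    compare = [min(cls) <= x < max(cls) for cls in classes]
    try:
        # score of the first class that contains x
        return news[compare.index(True)]
    except ValueError:
        # x falls into none of the classes
        return 99.0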
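
As for the broadcasting error itself: np.vectorize treats every argument as an array to be broadcast, so the list of class tuples and the list of scores are converted to arrays and broadcast against the data array, which fails. If you want to keep np.vectorize, a minimal sketch (assuming the half-open intervals from the question; the names `classes`, `news` and `arr` below are illustrative) is to mark those two arguments as excluded:

```python
import numpy as np

def reclasser(x, classes, news):
    # True for the class whose half-open interval [low, high) contains x
    compare = [low <= x < high for (low, high) in classes]
    try:
        return news[compare.index(True)]
    except ValueError:
        # x falls into none of the classes
        return 99.0

# illustrative inputs, taken from the values in the question
classes = [(0, 1.5), (1.5, 2.5), (2.5, 3.5)]
news = [0.75, 0.50, 0.25]

# excluded={1, 2}: positional arguments 1 and 2 are passed through as-is
# instead of being broadcast; otypes fixes the output dtype to float
v_func = np.vectorize(reclasser, excluded={1, 2}, otypes=[float])

arr = np.array([[0.2, 1.5], [3.0, 5.0]])
out = v_func(arr, classes, news)
print(out)
```

Note that np.vectorize is essentially a Python-level loop, so this mainly fixes the error rather than making the code fast.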

That said, I think using the reclasser and np.vectorize is unnecessarily complex. Instead, you could do something like:

# class -> score mapping as a dict
class_scores = {class1: score1, class2: score2, class3: score3}
# array of default scores, same shape as the data
scores = np.full(arr_f.shape, 99.0)

for cls, score in class_scores.items():
    # boolean mask of the values that fall into the current class
    # (lower bound inclusive, upper bound exclusive, as in the question)
    in_cls = np.logical_and(cls[0] <= arr_f, arr_f < cls[1])
    # update scores for the current class
    scores[in_cls] = score

scores will then be an array of scores corresponding to the original data array.
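
The same idea can probably be written without an explicit loop using np.select, which picks a value per element from a list of conditions and falls back to a default. This is only a sketch, assuming the lower bound of each class is inclusive and the upper bound exclusive (as in the question's original comparison); `classes`, `class_scores` and `arr` are illustrative names:

```python
import numpy as np

# illustrative inputs, taken from the values in the question
classes = [(0, 1.5), (1.5, 2.5), (2.5, 3.5)]
class_scores = [0.75, 0.50, 0.25]

arr = np.array([[0.2, 1.5], [3.0, 5.0]])

# one boolean mask per class: low <= value < high
conditions = [(low <= arr) & (arr < high) for low, high in classes]

# for each element, take the score of the first matching class,
# or the default when no class matches
scores = np.select(conditions, class_scores, default=99.0)
print(scores)
```

If the classes are contiguous and sorted, np.digitize would be another option, but np.select stays closer to the interval-per-class formulation used here.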
