Binning in NumPy

Question

I have an array A that I am trying to put into 10 bins. Here is what I've done:

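import numpy as np

A = range(1, 94)                   # the integers 1..93
hist = np.histogram(A, bins=10)    # hist[0]: counts, hist[1]: the 11 bin edges
np.digitize(A, hist[1])            # bin index for each element of A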

But the output has 11 bins, not 10: the last value (93) is placed in bin 11 when it should be in bin 10. I can fix it with a hack, but what is the most elegant way of doing this? How do I tell digitize that the last bin in hist[1] is inclusive on the right, i.e. [ ] instead of [ )?

Answer 1

The output of np.histogram actually has 10 bins; the last (right-most) bin includes the greatest element because its right edge is inclusive (unlike the other bins).
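A quick way to check this, as a small sketch on the question's data (the names `counts` and `edges` are only illustrative, not from the original post):

```python
import numpy as np

A = range(1, 94)
counts, edges = np.histogram(A, bins=10)

print(len(counts))    # 10 bins
print(len(edges))     # 11 edges delimit those 10 bins
print(counts.sum())   # 93: every value is counted, including the maximum
```

Because the counts add up to the full length of A, the maximum value 93, which sits exactly on the last edge, must have been counted in the tenth bin.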

np.digitize makes no such exception (its purpose is different), so the largest element(s) of the list are placed in an extra bin. To get bin assignments that are consistent with np.histogram, clamp the output of np.digitize to the number of bins using np.fmin.

import numpy as np

A = range(1, 94)
bin_count = 10
hist = np.histogram(A, bins=bin_count)
# Clamp bin indices so the maximum value falls in bin 10 instead of bin 11
np.fmin(np.digitize(A, hist[1]), bin_count)

Output:

array([ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2,  2,  2,
        2,  2,  3,  3,  3,  3,  3,  3,  3,  3,  3,  4,  4,  4,  4,  4,  4,
        4,  4,  4,  5,  5,  5,  5,  5,  5,  5,  5,  5,  6,  6,  6,  6,  6,
        6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  7,  7,  7,  7,  8,  8,  8,
        8,  8,  8,  8,  8,  8,  9,  9,  9,  9,  9,  9,  9,  9,  9, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10])
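As a sanity check, the clamped assignments can be compared against the counts that np.histogram reports. The sketch below is illustrative only: the names `clamped`, `per_bin` and `alt` are hypothetical, and the `edges[:-1]` variant is a different technique from the fmin clamp, shown here just for comparison.

```python
import numpy as np

A = np.arange(1, 94)
bin_count = 10
counts, edges = np.histogram(A, bins=bin_count)

# The fmin clamp: labels run from 1 to bin_count
clamped = np.fmin(np.digitize(A, edges), bin_count)

# Count how many values landed in each label; index 0 is unused
# because digitize labels start at 1 for in-range data.
per_bin = np.bincount(clamped, minlength=bin_count + 1)[1:]
print(np.array_equal(per_bin, counts))

# Alternative (not the clamp): omit the last edge, so nothing can
# land beyond bin `bin_count` in the first place.
alt = np.digitize(A, edges[:-1])
print(np.array_equal(alt, clamped))
```

Both approaches also assign any value greater than the last edge to the final bin, which may or may not be what you want for data outside the histogram's range.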
