
2D histogram coloured by the “label fraction” of data in each bin

Following on from the post "2D histogram coloured by standard deviation in each bin":

In Python, I would like to colour each bin of a 2D grid by the fraction of its points whose label values fall below a certain threshold.

Note that, in this dataset, each point has a continuous label value between 0 and 1.

For example, here is a histogram I made in which the colour denotes the standard deviation of the label values of all points in each bin:

(image: 2D histogram coloured by the standard deviation of label values in each bin)

This was done using

scipy.stats.binned_statistic_2d()

(see: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic_2d.html)

and setting the statistic argument to 'std'.
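A minimal sketch of that kind of call, assuming synthetic stand-in arrays x, y and labels (hypothetical names and data, not the original dataset) and a 10x10 grid:

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (assumption): positions x, y and a label in [0, 1].
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 1000)
y = rng.uniform(0.0, 1.0, 1000)
labels = rng.uniform(0.0, 1.0, 1000)

# Standard deviation of the label values in each cell of a 10x10 grid.
result = stats.binned_statistic_2d(x, y, labels, statistic='std', bins=10)
std_grid = result.statistic  # shape (10, 10), indexed as [x_bin, y_bin]
```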

But is there a way to change this kind of plot so that the colouring represents the fraction of points in each bin with a label value below, for example, 0.5?

It could be that the only way to do this is to define a grid explicitly and calculate the fractions, but I'm not sure of the best way to do that, so any help on this matter would be greatly appreciated!

Maybe using scipy.stats.binned_statistic_2d or numpy.histogram2d and returning the raw data values in each bin as a multidimensional array would make it possible to compute the fractions quickly and explicitly.

The fraction of elements in an array below a threshold can be calculated as

fraction = lambda a, threshold: len(a[a<threshold])/len(a)
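As a quick check of the idea, here is an equivalent sketch written as a named function (fraction_below is a hypothetical name, and the four label values are made up for illustration). It relies on the fact that the mean of a boolean array is the fraction of True entries:

```python
import numpy as np

def fraction_below(a, threshold):
    """Return the fraction of entries of `a` strictly below `threshold`."""
    a = np.asarray(a)
    # a < threshold is a boolean array; its mean is the fraction of True entries.
    return np.mean(a < threshold)

labels = np.array([0.1, 0.4, 0.6, 0.9])  # made-up example values
print(fraction_below(labels, 0.5))  # 2 of the 4 values are below 0.5 -> 0.5
```

For an empty array this version returns nan (with a NumPy warning), whereas the lambda form raises a ZeroDivisionError.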

Hence you can call

scipy.stats.binned_statistic_2d(x, y, values, statistic=lambda a: fraction(a, 0.5)) 
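Put together, the call might look like the sketch below. The data are synthetic stand-ins (an assumption, not the asker's dataset), and fraction_below, the 8x8 grid and the 0.5 threshold are illustrative choices. The last part cross-checks the result with numpy.histogram2d by dividing the per-bin count of points below the threshold by the per-bin total count:

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (assumption): positions and a label in [0, 1].
rng = np.random.default_rng(42)
n = 5000
x = rng.normal(size=n)
y = rng.normal(size=n)
values = rng.uniform(0.0, 1.0, size=n)

threshold = 0.5  # illustrative threshold
bins = 8         # illustrative grid size

def fraction_below(a):
    """Fraction of the values in one bin that lie below the threshold."""
    a = np.asarray(a)
    return np.mean(a < threshold)

# Per-bin fraction; bins that contain no points come out as nan.
frac, x_edges, y_edges, _ = stats.binned_statistic_2d(
    x, y, values, statistic=fraction_below, bins=bins)

# Cross-check: count of points below the threshold / count of all points,
# using the same bin edges.
total, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
below, _, _ = np.histogram2d(
    x, y, bins=[x_edges, y_edges],
    weights=(values < threshold).astype(float))
with np.errstate(invalid='ignore', divide='ignore'):
    frac_hist = below / total  # 0/0 gives nan for empty bins
```

The histogram2d route avoids calling a Python function once per bin, which may matter for large grids. To draw the result, one option (assuming matplotlib is available) is pcolormesh(x_edges, y_edges, frac.T); the transpose is needed because the first axis of frac runs along x.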
