Creating histograms in pandas with equidistant bars, not proportional to the bin range
I am creating a histogram in pandas simply using:
train_data.hist("MY_VARIABLE", bins=[0,5, 10,50,100,500,1000,5000,10000,50000,100000])
(train_data is a pandas DataFrame.)
The problem is that, since the range [50000, 100000] is so large, I can barely see the small ranges such as [0, 5] or [5, 10]. I would like the histogram to have equidistant bars on the x-axis, not proportional to the range. Is this possible?
You can do it this way:
bins = [0, 5, 10,50,100,500,1000,5000,10000,50000,100000]
df.groupby(pd.cut(df.a, bins=bins, labels=bins[1:])).size().plot.bar(rot=0)
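Demo:
import numpy as np
import pandas as pd
# 10,000 rows of random integers in [0, 100000) in columns 'a' and 'b'
df = pd.DataFrame(np.random.randint(0, 10**5, (10**4, 2)), columns=list('ab'))
bins = [0, 5, 10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000]
df.groupby(pd.cut(df.a, bins=bins, labels=bins[1:])).size().plot.bar(rot=0)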
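One detail worth checking in the approach above: with the default settings of pd.cut, the first bin is open on the left, so a value exactly equal to 0 is not assigned to any bin. The following is a minimal sketch of a variant that keeps such values, assuming interval labels are acceptable on the x-axis. The helper name binned_counts and the variable sample are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

def binned_counts(values, bins):
    """Count values per bin; the result is indexed by the bin intervals."""
    # include_lowest=True closes the first bin on the left, so a value equal
    # to bins[0] (here 0) is counted instead of becoming NaN.
    binned = pd.cut(values, bins=bins, include_lowest=True)
    # sort=False keeps the bins in their natural order, empty bins included.
    return binned.value_counts(sort=False)

bins = [0, 5, 10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000]
rng = np.random.default_rng(0)
sample = pd.Series(rng.integers(0, 10**5, size=10**4))
counts = binned_counts(sample, bins)
# counts.plot.bar(rot=45) would draw one equal-width bar per bin.
```

Because the result is an ordinary Series with one row per bin, the bar plot spaces the bars evenly regardless of how wide each bin is.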
Filtering results:
threshold = 100
(df.groupby(pd.cut(df.a,
                   bins=bins,
                   labels=bins[1:]))
   .size()
   .to_frame('count')
   .query('count > @threshold')
)
Out[84]:
        count
a
5000      396
10000     492
50000    4044
100000   4961
Plotting the filtered results:
(df.groupby(pd.cut(df.a,
                   bins=bins,
                   labels=bins[1:]))
   .size()
   .to_frame('count')
   .query('count > @threshold')
   .plot.bar(rot=0, width=1.0)
)
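As an alternative to grouping with pd.cut, the counts can also be computed with np.histogram and then plotted as a categorical bar chart. This is only a sketch under the assumption that text labels such as "0-5" are wanted on the x-axis. The names values, labels and hist_counts are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

bins = [0, 5, 10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000]
rng = np.random.default_rng(0)
values = rng.integers(0, 10**5, size=10**4)

# np.histogram uses half-open bins [lo, hi), except the last one, which
# also includes its right edge.
hist, _ = np.histogram(values, bins=bins)

# One text label per bin: "0-5", "5-10", ...
labels = [f"{lo}-{hi}" for lo, hi in zip(bins[:-1], bins[1:])]
hist_counts = pd.Series(hist, index=labels, name="count")
# hist_counts.plot.bar(rot=45, width=1.0) would draw adjacent equal-width bars.
```

Note that the bin edges are handled differently here than in pd.cut, whose default bins are closed on the right, so counts for values lying exactly on an edge can differ between the two approaches.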