How to print categories in pandas.cut?

Notice that when you apply pandas.cut to a DataFrame column, the output shows the bin of each element along with Name:, Length:, dtype:, and Categories. I just want the Categories array printed so I can get only the ranges of the bins I asked for. For example, with bins=4 applied to a DataFrame holding the numbers 1, 2, 3, 4, 5, I would want the output to show solely the ranges of the four bins, i.e. (1, 2], (2, 3], (3, 4], (4, 5].

Is there any way I can do this? It can be anything, even if it doesn't involve printing "Categories".

I am guessing that you just want to get the bins from pd.cut(). If so, you can simply set retbins=True (see the documentation for pd.cut). For example:

In[01]:

import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3, 4, 5]})
cats, bins = pd.cut(data.a, 4, retbins=True)

Out[01]:

cats:

0    (0.996, 2.0]
1    (0.996, 2.0]
2      (2.0, 3.0]
3      (3.0, 4.0]
4      (4.0, 5.0]
Name: a, dtype: category
Categories (4, interval[float64]): [(0.996, 2.0] < (2.0, 3.0] < (3.0, 4.0] < (4.0, 5.0]]

bins:

array([0.996, 2.   , 3.   , 4.   , 5.   ])
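If only the interval labels themselves are wanted, another option is to read them from the categorical result rather than from the raw edges. This is a minimal sketch, not part of the original answer; the names `intervals` and `labels` are made up for illustration.

```python
import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3, 4, 5]})
cats, bins = pd.cut(data['a'], 4, retbins=True)

# The categories of the resulting Series are an IntervalIndex:
# one interval per bin, regardless of how many rows fall into each.
intervals = cats.cat.categories

# Turn each interval into its string form and print them on one line.
labels = [str(iv) for iv in intervals]
print(", ".join(labels))
```

Note that the lowest edge is slightly below 1 (0.996 in the output above) because pd.cut extends the range a little so that the minimum value is included in the first bin.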

Then you can reuse the bins as you please, e.g.:

lst = [1, 2, 3]
category = pd.cut(lst,bins)
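To make the reuse concrete, here is a small self-contained sketch of what passing the returned edges back into pd.cut gives. The extra value 10 is an assumption added for illustration; it falls outside the edges and so comes back as missing.

```python
import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3, 4, 5]})
_, bins = pd.cut(data['a'], 4, retbins=True)

# Reuse the edges computed from `data` on a new list of values.
# 10 is outside the last edge (5.0), so it is assigned no bin (NaN).
lst = [1, 2, 3, 10]
category = pd.cut(lst, bins)

print(category)
print(category.isna())
```

Because the edges are fixed, the new values are binned exactly as the original column was, which is useful when the same binning has to be applied to separate data sets.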

For anyone who came here to see how to select a particular bin produced by pd.cut: you can compare the binned column against a pd.Interval object.

df['bin'] = pd.cut(df['y'], [0.1, .2,.3,.4,.5, .6,.7,.8 ,.9])
print(df["bin"].value_counts())

Output:
(0.2, 0.3]    697
(0.4, 0.5]    156
(0.5, 0.6]    122
(0.3, 0.4]     12
(0.6, 0.7]      8
(0.7, 0.8]      4
(0.1, 0.2]      0
(0.8, 0.9]      0

print(df.loc[df['bin'] == pd.Interval(0.7, 0.8)])
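Since the df used above is not shown, here is a runnable sketch of the same selection on made-up data. The column values and the edges (chosen as 0.25 steps so they are exact in floating point) are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data and edges; the original df is not shown in the answer.
df = pd.DataFrame({'y': [0.1, 0.3, 0.6, 0.7, 0.9]})
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
df['bin'] = pd.cut(df['y'], edges)

# pd.Interval is closed on the right by default, which matches
# the intervals pd.cut produces with its default right=True.
target = pd.Interval(0.5, 0.75)

# Keep only the rows whose bin equals the target interval.
selected = df.loc[df['bin'] == target]
print(selected)
```

The interval passed to the comparison has to match one of the categories exactly, including which side is closed, so it can be safer to pick it from df['bin'].cat.categories than to type the endpoints by hand.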
