Convert column in dataframe to “classes”?
So I've essentially got this dataframe:
,club_name,tr_begin,year,ranking
0,ADO Den Haag,1357,2010,6.0
1,ADO Den Haag,1480,2011,15.0
2,ADO Den Haag,1397,2012,9.0
3,ADO Den Haag,1384,2013,9.0
4,ADO Den Haag,1451,2014,13.0
What I want to do is go through every ranking and put it into a class based on its value. So a ranking of 6 would go into class number 2 and a ranking of 1 would go into class number 1. The conversion table is this:
if ranking > 0 and ranking <= 3:
    rank_class = 1
if ranking > 3 and ranking <= 6:
    rank_class = 2
# etc.
I would like this to continue in multiples of 3 up to 18.
So my hoped-for output would be:
,club_name,tr_begin,year,ranking, ranking_class
0,ADO Den Haag,1357,2010,6.0, 2
1,ADO Den Haag,1480,2011,15.0, 5
2,ADO Den Haag,1397,2012,9.0, 3
3,ADO Den Haag,1384,2013,9.0, 3
4,ADO Den Haag,1451,2014,13.0, 5
I tried the mask function, and also making a new dataframe and then merging. This worked but seemed very sloppy. Is there an easier way to do this?
Thanks in advance
Using pandas.cut, you can define iterables for your "bins" and "labels". This is simplified by the fact that they can both be defined using range objects.
I recommend you convert your ranking series to int first; it may be affected by floating-point rounding, which may yield undesirable results.
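import pandas as pd

# index_col=0 treats the unnamed first CSV column as the row index
df = pd.read_csv('file.csv', index_col=0)
# bin edges 0, 3, ..., 18 and class labels 1..6
binrange = range(0, 19, 3)
labrange = range(1, 7)
df['ranking_class'] = pd.cut(df['ranking'], bins=binrange, labels=labrange)
print(df)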
club_name tr_begin year ranking ranking_class
0 ADO Den Haag 1357 2010 6.0 2
1 ADO Den Haag 1480 2011 15.0 5
2 ADO Den Haag 1397 2012 9.0 3
3 ADO Den Haag 1384 2013 9.0 3
4 ADO Den Haag 1451 2014 13.0 5
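The int conversion recommended above is not shown in the snippet. A minimal sketch of it follows, with the sample rows built inline instead of being read from file.csv (an assumption made only so the example is self-contained). With its default right=True, pd.cut uses the intervals (0, 3], (3, 6], ..., (15, 18], which matches the conversion table in the question.

```python
import pandas as pd

# Sample rows from the question, built inline so the sketch is self-contained.
df = pd.DataFrame({
    "club_name": ["ADO Den Haag"] * 5,
    "tr_begin": [1357, 1480, 1397, 1384, 1451],
    "year": [2010, 2011, 2012, 2013, 2014],
    "ranking": [6.0, 15.0, 9.0, 9.0, 13.0],
})

# Convert to int first (assumes no missing rankings; astype(int) raises on NaN).
ranking_int = df["ranking"].astype(int)

# Bin edges 0, 3, ..., 18 give the intervals (0, 3], (3, 6], ..., (15, 18].
bins = list(range(0, 19, 3))
labels = list(range(1, 7))
df["ranking_class"] = pd.cut(ranking_int, bins=bins, labels=labels)

print(df["ranking_class"].tolist())
```

The resulting column has a categorical dtype; if plain integers are needed, df["ranking_class"].astype(int) should convert it.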
I think integer division // would do it:
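# Adding 2 before dividing rounds up, so 13 -> 5 and 1 -> 1
# (plain // 3 would give 4 and 0). Assumes whole-number rankings.
df.assign(ranking_class=((df.ranking + 2) // 3).astype(int))
club_name tr_begin year ranking ranking_class
0 ADO Den Haag 1357 2010 6.0 2
1 ADO Den Haag 1480 2011 15.0 5
2 ADO Den Haag 1397 2012 9.0 3
3 ADO Den Haag 1384 2013 9.0 3
4 ADO Den Haag 1451 2014 13.0 5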
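An arithmetic shortcut has to round up to agree with the question's table: dividing and rounding down maps 13 to 4 and 1 to 0, whereas the table asks for 5 and 1. The sketch below compares a round-down and a round-up variant against pd.cut for every ranking from 1 to 18. The use of np.ceil is an assumption of this sketch rather than something taken from the answers, and it presumes rankings are whole numbers.

```python
import numpy as np
import pandas as pd

# Every possible ranking, stored as floats like the column in the question.
rankings = pd.Series(range(1, 19), dtype=float)

# Reference classes from pd.cut: (0, 3] -> 1, (3, 6] -> 2, ..., (15, 18] -> 6.
expected = pd.cut(
    rankings, bins=list(range(0, 19, 3)), labels=list(range(1, 7))
).astype(int)

floor_classes = (rankings // 3).astype(int)       # rounds down
ceil_classes = np.ceil(rankings / 3).astype(int)  # rounds up

print("floor matches pd.cut:", (floor_classes == expected).all())
print("ceil matches pd.cut:", (ceil_classes == expected).all())
```

Only the round-up variant should agree with pd.cut across the whole range; the round-down variant is right only when the ranking is an exact multiple of 3.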