Change value column based on condition
I have a dataframe df with a column whose values range between 0 and 1.
I would like to change the values from numerical to ordinal as follows:
'0-20' for x <= 0.2
'20-40' for 0.2 < x <= 0.4
'40-60' for 0.4 < x <= 0.6
'60-80' for 0.6 < x <= 0.8
'80-100' for 0.8 < x <= 1
I've run X['Probability'].loc[X['Probability'] <= 0.2] = '0-20'
But on the next one I get an error saying:
TypeError: unorderable types: str() > float().
How do I get past this? Thanks!
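A likely reason for the error, sketched under the assumption that the first assignment left the column holding both strings and floats (object dtype): the next comparison then has to order the string '0-20' against a float, which Python 3 refuses to do. The Series s below is hypothetical data, not taken from the question, and newer Python versions word the message differently from the one quoted above.

```python
import pandas as pd

# Hypothetical column state after the first assignment: one value has already
# been replaced by the label '0-20', the rest are still floats (object dtype).
s = pd.Series(['0-20', 0.3, 0.5], dtype=object)

error_message = None
try:
    # Second step of the question's approach: compare every element with a float.
    mask = (s > 0.2) & (s <= 0.4)
except TypeError as exc:
    # Ordering a str against a float is not defined in Python 3.
    error_message = str(exc)

print(error_message)
```

This is why assigning the labels one range at a time into the same column tends to break after the first step; writing the labels to a separate column, or binning everything in one call, avoids the mixed types.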
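You can use pd.cut with explicit bin edges and labels:
import numpy as np
import pandas as pd

# Bins are right-closed; -np.inf makes the first bin cover every x <= 0.2
bins = [-np.inf, .2, .4, .6, .8, 1]
labels = ["{0}-{1}".format(i, i + 20) for i in range(0, 100, 20)]
# same as
# labels = ['0-20', '20-40', '40-60', '60-80', '80-100']
df['label'] = pd.cut(df['Probability'], bins=bins, labels=labels)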
Sample:
import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.random((10, 1)), columns=['Probability'])
# Force the edge cases 0, 0.4 and 1 into the data
df.loc[0, 'Probability'] = 0
df.loc[8, 'Probability'] = 0.4
df.loc[9, 'Probability'] = 1
bins = [-np.inf, .2, .4, .6, .8, 1]
labels = ["{0}-{1}".format(i, i + 20) for i in range(0, 100, 20)]
df['label'] = pd.cut(df['Probability'], bins=bins, labels=labels)
print(df)
   Probability   label
0     0.000000    0-20
1     0.278369   20-40
2     0.424518   40-60
3     0.844776  80-100
4     0.004719    0-20
5     0.121569    0-20
6     0.670749   60-80
7     0.825853  80-100
8     0.400000   20-40
9     1.000000  80-100
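As a variation on the same idea, here is a minimal sketch that assumes every value lies in [0, 1]: the lowest edge can be 0 instead of -np.inf if include_lowest=True is passed, so that an exact 0 still lands in the first bin. The Series name probs and its values are made up for illustration.

```python
import pandas as pd

# Made-up values that sit exactly on the bin edges, plus one interior value
probs = pd.Series([0.0, 0.2, 0.4, 0.75, 1.0], name='Probability')

bins = [0, .2, .4, .6, .8, 1]
labels = ['0-20', '20-40', '40-60', '60-80', '80-100']

# right=True (the default) keeps each bin closed on the right, so 0.2 -> '0-20';
# include_lowest=True additionally closes the first bin on the left, so 0 -> '0-20'.
binned = pd.cut(probs, bins=bins, labels=labels, include_lowest=True)
print(binned.astype(str).tolist())
```

With these edges any value outside [0, 1] becomes NaN, whereas the -np.inf lower edge would put negative values into '0-20'. Which behaviour is preferable depends on whether out-of-range values should be flagged or absorbed.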