Change column values based on conditions

Question

I have a dataframe df with a column whose values range between 0 and 1.

I would like to change the values from numerical to ordinal as follows:

'0-20'  for x <= 0.2
'20-40'  for 0.2 < x <= 0.4
'40-60'  for 0.4 < x <= 0.6
'60-80'  for 0.6 < x <= 0.8
'80-100'  for 0.8 < x <= 1
The first assignment goes through:

X['Probability'].loc[X['Probability'] <= 0.2] = '0-20'

But on the next one I get an error saying:

TypeError: unorderable types: str() > float()

How do I get past this? Thanks!

Answer 1

You can use pd.cut:

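import numpy as np
import pandas as pd

# bin edges; -np.inf keeps 0 (and anything below it) in the first bin
bins = [-np.inf, .2, .4, .6, .8, 1]
labels = ["{0}-{1}".format(i, i + 20) for i in range(0, 100, 20)]
# same as
# labels = ['0-20', '20-40', '40-60', '60-80', '80-100']

df['label'] = pd.cut(df['Probability'], bins=bins, labels=labels)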
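
Since the question asks for ordinal values, it may help to check two things about the result: pd.cut closes each bin on the right by default (right=True), which matches the `x <= edge` rules in the question, and the labels come back as an ordered categorical. A minimal sketch, assuming a small made-up Series named probs whose values sit on the bin edges:

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, chosen to land on the bin edges.
probs = pd.Series([0.0, 0.2, 0.4, 0.41, 1.0])

bins = [-np.inf, .2, .4, .6, .8, 1]
labels = ['0-20', '20-40', '40-60', '60-80', '80-100']

# right=True is the default, so each bin is (left, right].
binned = pd.cut(probs, bins=bins, labels=labels)

print(binned.tolist())             # edge values fall into the lower bin
print(binned.cat.ordered)          # True: the categories carry an order
print((binned > '0-20').tolist())  # ordered categories can be compared
```

Because the categories are ordered, sorting and comparisons follow the bin order rather than alphabetical order of the label strings.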

Sample:

np.random.seed(100)
df = pd.DataFrame(np.random.random((10,1)), columns=['Probability'])
df.loc[0, 'Probability'] = 0
df.loc[8, 'Probability'] = 0.4
df.loc[9, 'Probability'] = 1

bins = [-np.inf, .2, .4, .6, .8, 1]
labels = ["{0}-{1}".format(i, i + 20) for i in range(0, 100, 20)]
df['label'] = pd.cut(df['Probability'], bins=bins, labels=labels)
print(df)
   Probability   label
0     0.000000    0-20
1     0.278369   20-40
2     0.424518   40-60
3     0.844776  80-100
4     0.004719    0-20
5     0.121569    0-20
6     0.670749   60-80
7     0.825853  80-100
8     0.400000   20-40
9     1.000000  80-100
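
As for the original error, a likely cause (assuming the first assignment wrote into the same Probability column) is that the column then holds a mix of strings and floats, so the next comparison tries to order a str against a float. One way around it, if cut is not wanted, is to compute all conditions on the untouched numeric column and write the labels to a separate column. The sketch below uses np.select for that; it is an alternative to the answer's approach, not part of it, and the sample values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column name follows the question.
df = pd.DataFrame({'Probability': [0.0, 0.15, 0.2, 0.35, 0.55, 0.75, 1.0]})
p = df['Probability']

# Conditions are evaluated on the numeric column only, in ascending order.
conditions = [p <= 0.2, p <= 0.4, p <= 0.6, p <= 0.8, p <= 1]
choices = ['0-20', '20-40', '40-60', '60-80', '80-100']

# np.select takes the first condition that is True for each row;
# rows matching none of them get the default string.
df['label'] = np.select(conditions, choices, default='out of range')

print(df)
```

Probability stays numeric here, so later comparisons on it keep working. Unlike pd.cut, the label column is plain strings without an order.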

Answer 1
2, accepted, 2017-03-30 14:25:40
