In a pandas dataframe I would like to assign a value to a column based on filtering other columns to certain values

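For example, I would like to change all values in the 'ModelPrediction' column to 1 where the 'AgeGrp' column equals [0,5], the 'Sex' column equals male, and the 'Pclass' column equals '1' and '2'.

I have already changed the data types of the AgeGrp and Pclass columns to object.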

My attempt is below:

train.loc[train['Sex'] == 'male' & ['Pclass'] == 1 & ['Pclass'] == 2 & ['AgeGrp'] == (0, 5], 'ModelPrediction'] = 1  

I am new to all things python/pandas, so any help is appreciated! Thanks!

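I think you need to wrap each condition in () and compare AgeGrp against a pd.Interval. The Pclass condition also appears twice; if you need to check for both values, I think you need isin here: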

import pandas as pd

train = pd.DataFrame({'Sex':['male','female','male'],
                      'Pclass':[1,0,1],
                      'AgeGrp':[pd.Interval(0, 5, closed='right'),
                                pd.Interval(6, 10, closed='right'),
                                pd.Interval(0, 5, closed='right')],
                      'ModelPrediction':[0,1,0]})
print (train)
      Sex  Pclass   AgeGrp  ModelPrediction
0    male       1   (0, 5]                0
1  female       0  (6, 10]                1
2    male       1   (0, 5]                0

train.loc[(train['Sex'] == 'male') & 
          (train['Pclass'].isin([1, 2])) & 
          (train['AgeGrp'] == pd.Interval(0, 5, closed='right')), 'ModelPrediction'] = 1  

print (train)
      Sex  Pclass   AgeGrp  ModelPrediction
0    male       1   (0, 5]                1
1  female       0  (6, 10]                1
2    male       1   (0, 5]                1
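
If AgeGrp was produced by pd.cut (an assumption, since the question does not say how the column was built), the same comparison against a pd.Interval should still apply. The following is only a minimal sketch: the Age column, the bin edges, and the sample rows are all made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; 'Age' and its values are assumptions for this sketch.
train = pd.DataFrame({
    'Sex': ['male', 'female', 'male', 'male'],
    'Pclass': [1, 1, 2, 3],
    'Age': [3, 30, 4, 2],
    'ModelPrediction': [0, 0, 0, 0],
})

# pd.cut builds right-closed bins by default: (0, 5], (5, 10], (10, 80].
# The bin edges here are arbitrary.
train['AgeGrp'] = pd.cut(train['Age'], bins=[0, 5, 10, 80])

# Build each condition as its own boolean Series, then combine with &.
is_male = train['Sex'] == 'male'
in_class = train['Pclass'].isin([1, 2])
in_group = train['AgeGrp'] == pd.Interval(0, 5, closed='right')

train.loc[is_male & in_class & in_group, 'ModelPrediction'] = 1
print(train)
```

Naming the three masks separately makes it easier to inspect which condition excludes a given row before assigning.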

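You were very close, but one of your conditions (Pclass being both 1 and 2 at the same time) is impossible, Python has no (0, 5] interval syntax, and you need parentheses around each condition: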

train.loc[(train['Sex'] == 'male') & ((train['Pclass'] == 1) | (train['Pclass'] == 2)) & (train['AgeGrp'] > 0) & (train['AgeGrp'] <= 5), 'ModelPrediction'] = 1
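
The reason for the parentheses is that & binds more tightly than == in Python, so the unparenthesized attempt tries to evaluate 'male' & ['Pclass'] first. The sketch below illustrates this. It also assumes a numeric Age column (hypothetical, not in the question), because the > and <= comparisons above only make sense for numbers; if AgeGrp holds Interval objects, comparing against a pd.Interval is probably the better fit.

```python
import pandas as pd

# Hypothetical sample data; the numeric 'Age' column is an assumption.
train = pd.DataFrame({
    'Sex': ['male', 'female', 'male', 'male'],
    'Pclass': [1, 1, 2, 3],
    'Age': [3, 30, 4, 2],
    'ModelPrediction': [0, 0, 0, 0],
})

# Without parentheses, Python evaluates 'male' & ['Pclass'] before any ==,
# which fails because str and list do not support &.
try:
    train['Sex'] == 'male' & ['Pclass'] == 1
    precedence_error = None
except TypeError as exc:
    precedence_error = exc

# "Pclass is 1 or 2" needs | (or), not & (and); isin expresses the same test.
either_class = (train['Pclass'] == 1) | (train['Pclass'] == 2)

mask = (
    (train['Sex'] == 'male')
    & either_class
    & (train['Age'] > 0)
    & (train['Age'] <= 5)
)
train.loc[mask, 'ModelPrediction'] = 1
print(train)
```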
