Adding a new column based on an existing column in a pandas DataFrame gives NaN values
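I want to add a column based on an existing column of a DataFrame. The frame contains 5 columns. I need to replace the values in the Category column with numbers. Based on that, I need to add a 'Class' column and assign it the value 0 or 1 according to the conditions above.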
Desired result:
    File            Task  Category  Class
0   g0pA_taska.txt  a     0         0
1   g0pA_taskb.txt  b     3         1
2   g0pA_taskc.txt  c     2         1
3   g0pA_taskd.txt  d     1         1
4   g0pA_taske.txt  e     0         0
...
99  orig_taske.txt  e     -1        -1
plagiarism_df.replace({'Category' : {'non':0,'heavy':1,'light':2,'cut':3,'orig':-1}})
plagiarism_df.loc[plagiarism_df['Category']==0, 'Class'] = 0
plagiarism_df.loc[plagiarism_df['Category']==1, 'Class'] = 1
plagiarism_df.loc[plagiarism_df['Category']==2, 'Class'] = 1
plagiarism_df.loc[plagiarism_df['Category']==3, 'Class'] = 1
plagiarism_df.loc[plagiarism_df['Category']==-1,'Class'] = 1
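You are not modifying the DataFrame. replace returns a new DataFrame and leaves the original untouched, so Category still holds the original strings, none of the == comparisons match, and Class ends up as NaN. You have to assign the result back:
plagiarism_df = plagiarism_df.replace({'Category': {'non': 0, 'heavy': 1, 'light': 2, 'cut': 3, 'orig': -1}})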
Or pass the parameter inplace=True to modify the DataFrame object in place, like this:
plagiarism_df.replace({'Category':{ 'non': 0, 'heavy': 1, 'light': 2, 'cut': 3, 'orig': -1}}, inplace=True)
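To see the difference between discarding and assigning the result of replace, here is a minimal sketch. The three-row frame is hypothetical sample data, not the asker's real plagiarism_df, and the Category column is forced to object dtype as an assumption about how the strings are stored.

```python
import pandas as pd

# Hypothetical sample data; the real plagiarism_df has more rows and columns.
plagiarism_df = pd.DataFrame({
    "File": ["g0pA_taska.txt", "g0pA_taskb.txt", "orig_taske.txt"],
    "Category": pd.Series(["non", "cut", "orig"], dtype=object),
})
mapping = {"non": 0, "heavy": 1, "light": 2, "cut": 3, "orig": -1}

# Without assignment the returned DataFrame is thrown away,
# so the column still holds the original strings.
plagiarism_df.replace({"Category": mapping})
unchanged = plagiarism_df["Category"].tolist()

# Assigning the returned DataFrame back makes the change stick.
plagiarism_df = plagiarism_df.replace({"Category": mapping})
replaced = plagiarism_df["Category"].tolist()

print(unchanged)
print(replaced)
```

Once Category really contains numbers, the .loc assignments from the question find matching rows and fill Class as intended.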
Alternatively, you can use map and then apply with a lambda to get the desired result:
plagiarism_df['Category'] = plagiarism_df['Category'].map({'non': 0, 'heavy': 1, 'light': 2, 'cut': 3, 'orig': -1})
plagiarism_df['Class'] = plagiarism_df['Category'].apply(lambda x: 1 if x in [1,2,3,-1] else 0)
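The map approach can be sketched end to end as follows. The sample rows, the names category_codes and class_codes, and the extra ClassKeepOrig column are hypothetical. The second mapping is only an assumption based on the desired result in the question, where the orig row shows Class -1 instead of 1.

```python
import pandas as pd

# Hypothetical sample data standing in for the real plagiarism_df.
plagiarism_df = pd.DataFrame({
    "File": ["g0pA_taska.txt", "g0pA_taskb.txt", "g0pA_taskc.txt",
             "g0pA_taskd.txt", "orig_taske.txt"],
    "Category": ["non", "cut", "light", "heavy", "orig"],
})

# Turn the category labels into numeric codes.
category_codes = {"non": 0, "heavy": 1, "light": 2, "cut": 3, "orig": -1}
plagiarism_df["Category"] = plagiarism_df["Category"].map(category_codes)

# Rule from the answer: every category other than 0 becomes class 1.
plagiarism_df["Class"] = plagiarism_df["Category"].apply(
    lambda x: 1 if x in [1, 2, 3, -1] else 0
)

# Assumed variant: keep -1 for originals, as in the question's desired output.
class_codes = {0: 0, 1: 1, 2: 1, 3: 1, -1: -1}
plagiarism_df["ClassKeepOrig"] = plagiarism_df["Category"].map(class_codes)

print(plagiarism_df)
```

Note that map returns NaN for any label missing from the dictionary, whereas replace leaves unmatched values unchanged, so a typo in a category name shows up differently in the two approaches.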