Updating information in a DataFrame column

Question

I have a filtered dataset, new_df, like this:

    Label  New_Label    Username    Look_up
59  1.0    True         vald21      val
67  1.0    True         2512        2512
75  1.0    True         Christine   Chris

It was created to assign a new label (New_Label) when certain conditions were met. I also have another dataset (df) that contains all the data (the rows above were extracted from it) but has no information about New_Label, since the dataset above was created for that purpose by filtering on specific conditions.

        Label   Username    Look_up
    59  1.0     vald21      val
    67  1.0     2512        2512
    67  0.0     faehr6542   faehr
...
    75  1.0     Christine   Chris
   122  0.0     starogm     starogm

I would like to change Label in my original dataset df for the rows that appear in new_df, whenever Label and New_Label do not match.

        Label   Username    Look_up
    59  0       vald21      val
    67  0       2512        2512
    67  0       faehr6542   faehr
...
    75  0       Christine   Chris
   122  0       starogm     starogm

where True in new_df corresponds to 0 and False to 1 in the Label column. I do not want to change the other values, only those in the new_df dataset (my key would be Username).

Could you please explain how to change the information in the original dataset?

Thanks

Answer 1

You can try merging the two dataframes and then using .assign along with np.where. When merging with how='outer', rows that have no match get NA in New_Label, so np.where with notnull() can be used:

pd.merge(df, new_df, how='outer').assign(Label=lambda row: np.where(row['New_Label'].notnull(), 0, 1))

If you do not want New_Label, you can drop the column with .drop('New_Label', axis=1). Something like the following (written on one line):

pd.merge(df, new_df, how='outer').assign(Label=lambda row: np.where(row['New_Label'].notnull(), 0, 1)).drop('New_Label', axis=1)
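One thing to watch: as written, the snippet above assigns 1 to every row that has no match in new_df, so an existing 0 (for example the faehr6542 row) would be overwritten. A possible variation, offered only as a sketch, is a left merge on Username that flips New_Label where it exists and keeps the original Label everywhere else. The sample frames below are rebuilt by hand from the question, and the default integer index is an assumption (the merge does not keep the original index anyway).

```python
import numpy as np
import pandas as pd

# Sample data retyped from the question (default index is an assumption).
df = pd.DataFrame({
    "Label": [1.0, 1.0, 0.0, 1.0, 0.0],
    "Username": ["vald21", "2512", "faehr6542", "Christine", "starogm"],
    "Look_up": ["val", "2512", "faehr", "Chris", "starogm"],
})
new_df = pd.DataFrame({
    "Label": [1.0, 1.0, 1.0],
    "New_Label": [True, True, True],
    "Username": ["vald21", "2512", "Christine"],
    "Look_up": ["val", "2512", "Chris"],
})

# Bring only New_Label across, keyed on Username; unmatched rows get NaN.
merged = df.merge(new_df[["Username", "New_Label"]], on="Username", how="left")

# True -> 0, anything else -> 1 (only used where New_Label is present).
flipped = np.where(merged["New_Label"].eq(True), 0, 1)

# Keep the original Label for rows that were not in new_df.
merged["Label"] = np.where(merged["New_Label"].notna(), flipped, merged["Label"])
result = merged.drop(columns="New_Label")
```

Note that merge returns a fresh 0..n-1 index, so if the original row labels matter they would need to be restored separately.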

Answer 2

If I understand your question, you want to flip 'New_Label', convert it to int, and assign it to 'Label':

new_df['Label'] = (new_df['New_Label']==False).astype(int)
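The line above only changes new_df. To carry the flipped values back into df, one option is to look them up by Username and overwrite just the matching rows. This is a minimal sketch, assuming Username is unique within new_df; the example_user row is hypothetical and was added only to show a False value being mapped to 1.

```python
import pandas as pd

# Sample data based on the question; "example_user" is a hypothetical extra row.
df = pd.DataFrame({
    "Label": [1.0, 1.0, 0.0, 1.0, 0.0, 0.0],
    "Username": ["vald21", "2512", "faehr6542", "Christine", "starogm", "example_user"],
})
new_df = pd.DataFrame({
    "New_Label": [True, True, True, False],
    "Username": ["vald21", "2512", "Christine", "example_user"],
})

# Flip the boolean and store it as 0/1, as in the answer.
new_df["Label"] = (new_df["New_Label"] == False).astype(int)

# Username -> new label lookup (assumes Username is unique in new_df).
lookup = new_df.set_index("Username")["Label"]

# Overwrite Label only for usernames present in new_df.
mask = df["Username"].isin(lookup.index)
df.loc[mask, "Label"] = df.loc[mask, "Username"].map(lookup)
```

Rows whose Username does not appear in new_df (faehr6542 and starogm here) keep their original Label, and df keeps its own index because no merge is involved.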

2 answers

Answer 1
1 Accepted 2020-08-30 18:59:00

Answer 2
0 2020-08-30 19:00:41
