Updating information in dataframe column

I have a filtered dataset, new_df, like this:

    Label  New_Label    Username    Look_up
59  1.0    True         vald21      val
67  1.0    True         2512        2512
75  1.0    True         Christine   Chris

which was created to assign a new label (New_Label) when some conditions were met. I also have another dataset (df), which contains all the data (the rows above were extracted from it) but has no information about New_Label, since the dataset above was created for that purpose by filtering on specific conditions.

        Label   Username    Look_up
    59  1.0     vald21      val
    67  1.0     2512        2512
    67  0.0     faehr6542   faehr
...
    75  1.0     Christine   Chris
   122  0.0     starogm     starogm

I would like to change the Label in my original dataset df for the rows that appear in new_df, whenever Label and New_Label do not match:

        Label   Username    Look_up
    59  0       vald21      val
    67  0       2512        2512
    67  0       faehr6542   faehr
...
    75  0       Christine   Chris
   122  0       starogm     starogm

where True in new_df corresponds to 0 and False to 1 in the Label column. I do not want to change the other values, only those in the new_df dataset (my key would be Username).

Could you please explain how to change the information in the original dataset?

Thanks

You can try merging the two dataframes and then using .assign along with np.where. Because no on argument is given, the merge joins on all shared columns (Label, Username, Look_up). When merging with how='outer', rows of df that have no match in new_df get NA in New_Label, so np.where with notnull() can be used to set Label to 0 for the matched rows and keep the existing Label everywhere else:

pd.merge(df, new_df, how='outer').assign(Label=lambda d: np.where(d['New_Label'].notnull(), 0, d['Label']))

If you do not want New_Label, you can drop the column with .drop('New_Label', axis=1). Written in one line, that looks like this:

pd.merge(df, new_df, how='outer').assign(Label=lambda d: np.where(d['New_Label'].notnull(), 0, d['Label'])).drop('New_Label', axis=1)
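To check the merge approach against the sample data from the question, here is a minimal runnable sketch. The two frames are rebuilt by hand from the rows shown in the question (the elided rows are left out), the name result is only illustrative, and this variant keeps the existing Label for rows that are not in new_df, as the question asks.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (elided rows omitted).
df = pd.DataFrame(
    {
        "Label": [1.0, 1.0, 0.0, 1.0, 0.0],
        "Username": ["vald21", "2512", "faehr6542", "Christine", "starogm"],
        "Look_up": ["val", "2512", "faehr", "Chris", "starogm"],
    },
    index=[59, 67, 67, 75, 122],
)
new_df = pd.DataFrame(
    {
        "Label": [1.0, 1.0, 1.0],
        "New_Label": [True, True, True],
        "Username": ["vald21", "2512", "Christine"],
        "Look_up": ["val", "2512", "Chris"],
    },
    index=[59, 67, 75],
)

result = (
    # Joins on the shared columns: Label, Username, Look_up.
    pd.merge(df, new_df, how="outer")
    # Matched rows (New_Label not NA) get 0; all other rows keep their Label.
    .assign(Label=lambda d: np.where(d["New_Label"].notnull(), 0, d["Label"]))
    .drop("New_Label", axis=1)
)
print(result)
```

Note that merge returns a new dataframe with a fresh integer index, so the original index values (59, 67, ...) are not preserved in result.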

If I understand your question, you want to flip 'New_Label', convert it to int and assign it to 'Label':

new_df['Label'] = (new_df['New_Label']==False).astype(int) 
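The line above updates new_df only. To carry the flipped values back into df with Username as the key, one option is to build a Username-to-label lookup and overwrite only the matching rows. This is a sketch, assuming Username is unique within new_df; flipped and new_vals are illustrative names, and the frames are rebuilt from the rows shown in the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (elided rows omitted).
df = pd.DataFrame(
    {
        "Label": [1.0, 1.0, 0.0, 1.0, 0.0],
        "Username": ["vald21", "2512", "faehr6542", "Christine", "starogm"],
        "Look_up": ["val", "2512", "faehr", "Chris", "starogm"],
    },
    index=[59, 67, 67, 75, 122],
)
new_df = pd.DataFrame(
    {
        "Label": [1.0, 1.0, 1.0],
        "New_Label": [True, True, True],
        "Username": ["vald21", "2512", "Christine"],
        "Look_up": ["val", "2512", "Chris"],
    },
    index=[59, 67, 75],
)

# Lookup keyed by Username: True -> 0, False -> 1.
# Assumption: each Username appears only once in new_df.
flipped = new_df.set_index("Username")["New_Label"].eq(False).astype(int)

# Usernames missing from new_df map to NaN, so those rows keep their Label.
new_vals = df["Username"].map(flipped)
df["Label"] = np.where(new_vals.notna(), new_vals, df["Label"])
print(df)
```

Unlike the merge approach, this modifies df in place and leaves its index untouched. np.where works positionally, so the repeated index value 67 in the sample data is not a problem.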
