A clean and efficient way to update cells in Pandas DataFrames

Question

I am looking for a cleaner way to achieve the following:

I have a DataFrame with certain columns that I want to update if new information arrives. This "new information", in the form of a pandas DataFrame (from a CSV file), can have more or fewer rows; however, I am only interested in adding

Original DataFrame

DataFrame with new information

(Note the missing name "c" here and the change in "status" for name "a".)

I wrote the following "inconvenient" code to update the original DataFrame with the new information.

Updating the "status" column based on the "name" column:

for idx,row in df_base.iterrows():
    if not df_upd[df_upd['name'] == row['name']].empty:
        df_base.loc[idx, 'status'] = df_upd.loc[df_upd['name'] == row['name'], 'status'].values

It achieves exactly what I want, but it neither looks nice nor is efficient, and I hope there is a cleaner way. I tried the pd.merge method; however, the problem is that it adds new columns instead of "updating" the cells in the existing column.

pd.merge(left=df_base, right=df_upd, on=['name'], how='left')

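To make the merge problem concrete, here is a small sketch. The sample values are made up for illustration (the tables in the question were images), chosen only so that "c" is missing from the update and the status of "a" changes. With pandas' default suffixes, the overlapping "status" column comes back as two columns, which would then have to be collapsed by hand:

```python
import pandas as pd

# Hypothetical sample data standing in for the tables shown in the question.
df_base = pd.DataFrame({"name": ["a", "b", "c", "d"], "status": [1, 1, 0, 1]})
df_upd = pd.DataFrame({"name": ["a", "b", "d"], "status": [0, 1, 1]})

merged = pd.merge(left=df_base, right=df_upd, on=["name"], how="left")

# The shared column is duplicated with the default suffixes "_x" and "_y".
print(merged.columns.tolist())  # ['name', 'status_x', 'status_y']

# One possible manual collapse: take the new value where there is one,
# otherwise keep the old value. "c" has no match, so status_y is NaN there.
merged["status"] = merged["status_y"].fillna(merged["status_x"]).astype(int)
result = merged[["name", "status"]]
print(result)
```

This works, but it needs an extra collapse step for every updated column, which is the kind of overhead the question is trying to avoid.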

I am looking forward to your tips and ideas.

Answer 1

You could set_index("name") and then call .update:

>>> df_base = df_base.set_index("name")
>>> df_upd = df_upd.set_index("name")
>>> df_base.update(df_upd)
>>> df_base
      status
name        
a          0
b          1
c          0
d          1

More generally, you can set the index to whatever seems appropriate, update, and then reset as needed.
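As a self-contained sketch of that set-index, update, reset round trip: the sample values below are assumptions (the original tables were images), picked so that the result matches the output shown above. The variable name `indexed` is just for illustration.

```python
import pandas as pd

# Hypothetical sample data; "c" is absent from the update, "a" changes status.
df_base = pd.DataFrame({"name": ["a", "b", "c", "d"], "status": [1, 1, 0, 1]})
df_upd = pd.DataFrame({"name": ["a", "b", "d"], "status": [0, 1, 1]})

# Align both frames on "name" so update() can match rows by index label.
indexed = df_base.set_index("name")

# update() modifies `indexed` in place and returns None; only non-NA values
# from the other frame overwrite, so the row for "c" is left untouched.
indexed.update(df_upd.set_index("name"))

# Depending on the pandas version, the updated column may come back as float,
# so cast it back explicitly.
indexed["status"] = indexed["status"].astype(int)

# Restore "name" as an ordinary column.
df_base = indexed.reset_index()
print(df_base)
```

If rows are identified by more than one column, set_index also accepts a list of column names, and the same pattern applies.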

Answer 1: 2 (accepted), 2015-01-06 06:59:55
