Pandas DataFrame update from another dataframe only if a column does not have a value

I am trying to update a dataframe using the values in another dataframe, but I would like the update to happen only if a particular column does not have a value.

from datetime import datetime
import numpy as np
import pandas as pd

# Last three business days, ending today
dr = pd.bdate_range(periods=3, end=datetime.now().date())

df1 = pd.DataFrame([1, 2], columns=['myid'])
for d in dr:
    # pd.np was removed from pandas; use numpy directly
    df1[d.to_pydatetime()] = np.nan
df1.loc[df1['myid'] == 1, dr[2]] = 4.0
df1 = df1.set_index('myid')


df1
      2019-11-13 00:00:00  2019-11-14 00:00:00  2019-11-15 00:00:00
myid                                                               
1                     NaN                  NaN                  4.0
2                     NaN                  NaN                  NaN

df2 = pd.DataFrame([1, 2], columns=['myid'])
for d in dr:
    df2[d.to_pydatetime()] = np.nan
df2.loc[df2['myid'] == 2, dr[2]] = 4.0
df2.loc[df2['myid'] == 1, dr[0]] = 6.0
df2 = df2.set_index('myid')

df2
      2019-11-13 00:00:00  2019-11-14 00:00:00  2019-11-15 00:00:00
myid                                                               
1                     6.0                  NaN                  NaN
2                     NaN                  NaN                  4.0

I would like to update df1 with the values in df2, but only for rows where df1 does not have a value for dr[2] (the current date).

So in the above example, only the second row in df1 should get updated.

I tried update as follows, but I am not sure how to filter based on whether the column has a value or not:

df1.update(df2, overwrite=False)

I did look at the filter_func argument that update takes, but I was unable to make it work. Any help is much appreciated. Thanks.

EDIT:

Expected output:

Row 1 should not be touched because it already has a value in column 2019-11-15 00:00:00.

df1
      2019-11-13 00:00:00  2019-11-14 00:00:00  2019-11-15 00:00:00
myid                                                               
1                     NaN                  NaN                  4.0
2                     NaN                  NaN                  4.0

Update: This looks like a natural use of the filter_func argument. Update only rows where all columns of df1 are null:

# Compute the row mask once, before update() starts modifying df1 in place
m = df1.isnull().all(axis=1).to_numpy()
df1.update(df2, filter_func=lambda col: m)
#      2019-11-13 00:00:00  2019-11-14 00:00:00  2019-11-15 00:00:00
#myid                                                               
#1                     NaN                  NaN                  4.0
#2                     NaN                  NaN                  4.0
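For context, as far as I understand it, filter_func is called once per column with a 1-D array of the original values and should return a boolean array where True means "this cell may be overwritten". That is why a row-level condition is easiest to pass as a precomputed mask. A minimal sketch of the per-column behaviour, using hypothetical toy frames (target, source) and a hypothetical helper (only_missing) that are not part of the question:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frames, not the question's data
target = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})
source = pd.DataFrame({"a": [10.0, 20.0, 30.0], "b": [40.0, 50.0, 60.0]})

def only_missing(values):
    # `values` holds one column of `target`;
    # True marks the cells that update() is allowed to overwrite
    return np.isnan(values)

target.update(source, filter_func=only_missing)
print(target)
```

Here only the cells that were NaN in target pick up values from source; everything else is left alone.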

Old answer, more hands-on:

You can select the rows to update, update only those rows, then combine the pieces. update operates in place, so we need to split things out.

# True for rows that already hold at least one value
m = df1.notnull().any(axis=1)

# These get updated
u = df1[~m].copy()
u.update(df2)

df1 = pd.concat([df1[m], u])
#      2019-11-13 00:00:00  2019-11-14 00:00:00  2019-11-15 00:00:00
#myid                                                               
#1                     NaN                  NaN                  4.0
#2                     NaN                  NaN                  4.0

Alternatively, you could use combine_first, then mask the rows that shouldn't have been updated and reset them to the original values in df1:

df1.combine_first(df2).mask(df1.notnull().any(axis=1)).fillna(df1)
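The approaches above test whether a row of df1 is empty across all columns. The question asks about one specific column (dr[2]), and the two conditions only coincide for this sample data. A minimal sketch of the single-column variant, assuming fixed dates in place of dr so the example is reproducible (the names cols, target_col, rows and result are illustrative only):

```python
import numpy as np
import pandas as pd

# Fixed dates stand in for dr from the question (assumption, for reproducibility)
cols = pd.to_datetime(["2019-11-13", "2019-11-14", "2019-11-15"])
idx = pd.Index([1, 2], name="myid")

df1 = pd.DataFrame(np.nan, index=idx, columns=cols)
df1.loc[1, cols[2]] = 4.0

df2 = pd.DataFrame(np.nan, index=idx, columns=cols)
df2.loc[1, cols[0]] = 6.0
df2.loc[2, cols[2]] = 4.0

target_col = cols[2]  # plays the role of dr[2]

# Labels of the rows in df1 that have no value in the target column
rows = df1.index[df1[target_col].isna()]

# Update only those rows, working on a copy so df1 itself is untouched
u = df1.loc[rows].copy()
u.update(df2)

result = df1.copy()
result.loc[rows] = u
print(result)
```

Row 1 keeps its original values because it already has a value in the target column, while row 2 picks up the 4.0 from df2.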

Notice: Technical posts on this site follow the CC BY-SA 4.0 license. If you republish this content, please credit this site's URL or the original address.
