How to update a column in a pandas DataFrame based on a column from another DataFrame
Suppose I have two DataFrames:
df1 = pd.DataFrame({'name': ['Jack', 'Lucy', 'Mark'], 'age': [1, 2, 3]})
df2 = pd.DataFrame({'name': ['Jack', 'Mark'], 'age': [10, 11], 'address': ['addr1', 'addr2']})
What operation should I use to turn df1 into the following?
name age address
--------------------
Jack 10 addr1
Lucy 2 NaN
Mark 11 addr2
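You can merge the two DataFrames and then replace the missing values:
# left join keeps every row of df1; the clashing age columns become age_x and age_y
df_out = df1.merge(df2, on=['name'], how='left')
# take df2's age where a match exists (NaN > 0 is False), otherwise keep df1's age
df_out['age'] = df_out.apply(lambda x: x['age_y'] if x['age_y'] > 0 else x['age_x'], axis=1)
df_out[['name', 'age', 'address']]
Output: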
| name | age | address |
|:-------|------:|:----------|
| Jack | 10 | addr1 |
| Lucy | 2 | nan |
| Mark | 11 | addr2 |
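The `> 0` test in the answer above works because a comparison with NaN is False, but it would also discard a legitimate age of 0 or a negative value. A minimal sketch of the same merge-then-fill idea using `Series.fillna` instead; the variable name `merged` and the cast back to `int64` are choices made here for illustration, and the cast assumes no age is missing after filling:

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['Jack', 'Lucy', 'Mark'], 'age': [1, 2, 3]})
df2 = pd.DataFrame({'name': ['Jack', 'Mark'], 'age': [10, 11], 'address': ['addr1', 'addr2']})

# Left join keeps every row of df1; the clashing 'age' columns become age_x / age_y.
merged = df1.merge(df2, on='name', how='left')

# Prefer df2's age (age_y); fall back to df1's age (age_x) where df2 had no match.
# The result is float because of the NaN in age_y, so cast back to integers.
merged['age'] = merged['age_y'].fillna(merged['age_x']).astype('int64')

df_out = merged[['name', 'age', 'address']]
print(df_out)
```

This version is also vectorized, so it avoids calling a Python function once per row.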
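Use DataFrame.combine_first, after converting the name column to the index of both DataFrames:
# use name as the index so that rows are aligned by name
df1 = df1.set_index('name')
df2 = df2.set_index('name')
# values from df2 take priority; missing ones are filled from df1
df1 = df2.combine_first(df1).reset_index()
print(df1)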
name address age
0 Jack addr1 10.0
1 Lucy NaN 2.0
2 Mark addr2 11.0
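In the output above the columns come back sorted alphabetically and age is shown as float. If the layout asked for in the question (name, age, address with integer ages) matters, something along these lines may help. This is only a sketch: the names `left`, `right`, `col_order` and `result` are introduced here, and the integer cast assumes age has no missing values, which holds for this sample data.

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['Jack', 'Lucy', 'Mark'], 'age': [1, 2, 3]})
df2 = pd.DataFrame({'name': ['Jack', 'Mark'], 'age': [10, 11], 'address': ['addr1', 'addr2']})

# combine_first aligns on the index, so use 'name' as the index of both frames.
left = df1.set_index('name')
right = df2.set_index('name')

# Values from `right` win; gaps are filled from `left`.
combined = right.combine_first(left)

# Put df1's columns first, then the columns that exist only in df2.
col_order = list(left.columns) + [c for c in right.columns if c not in left.columns]
result = combined[col_order].reset_index()

# No age is missing in this example, so it is safe to cast back to integers.
result['age'] = result['age'].astype('int64')
print(result)
```

Note that the rows follow the union of both indexes, which for this data coincides with the order of df1.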
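The first original solution needs a change: starting again from the original df1 and df2, reindex df1 so that it contains the columns of df2 before calling DataFrame.update:
df1 = df1.set_index('name')
df2 = df2.set_index('name')
# add the columns that exist only in df2 (here: address), filled with NaN
df1 = df1.reindex(df1.columns.union(df2.columns, sort=False), axis=1)
# overwrite df1 in place with the non-NaN values of df2, aligned on name
df1.update(df2)
df1 = df1.reset_index()
print(df1)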
name age address
0 Jack 10.0 addr1
1 Lucy 2.0 NaN
2 Mark 11.0 addr2
Or a solution that uses a left join with DataFrame.merge, followed by DataFrame.combine_first:
# left join with df2; columns that already exist in df1 get '_' appended
df = df1.merge(df2, on='name', how='left', suffixes=('','_'))
# select the names of the appended columns
new_cols = df.columns[df.columns.str.endswith('_')]
# remove the last character to recover the original column names
orig_cols = new_cols.str[:-1]
# dictionary for renaming appended columns back to the original names
d = dict(zip(new_cols, orig_cols))
# take values from the appended columns, falling back to the original columns where they are NaN
df[orig_cols] = df[new_cols].rename(columns=d).combine_first(df[orig_cols])
# remove the appended columns
df = df.drop(new_cols, axis=1)
print(df)
name age address
0 Jack 10.0 addr1
1 Lucy 2.0 NaN
2 Mark 11.0 addr2
You can use concat, drop_duplicates, sort_index and reset_index:
df = pd.concat([df1,df2],ignore_index=False, sort=False).drop_duplicates(["name"], keep="last").sort_index().reset_index(drop=True)
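In the one-liner above, the final row order comes from sorting on the index labels that the rows carried in df1 and df2, so it depends on how those labels happen to line up. A hedged variant that restores the order of df1 explicitly with a left join on name; the intermediate name `combined` is introduced here, and the sketch assumes that names are unique within each DataFrame:

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['Jack', 'Lucy', 'Mark'], 'age': [1, 2, 3]})
df2 = pd.DataFrame({'name': ['Jack', 'Mark'], 'age': [10, 11], 'address': ['addr1', 'addr2']})

# Stack df1 on top of df2; for a repeated name the df2 row comes last and is kept.
combined = pd.concat([df1, df2], sort=False).drop_duplicates('name', keep='last')

# Restore df1's row order with a left join on df1's names
# instead of relying on the original index labels.
df = df1[['name']].merge(combined, on='name', how='left')
print(df)
```

Like the one-liner, this replaces whole rows, so a value that exists only in df1 for a name also present in df2 would not survive.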