How to replace one pandas dataframe column's values based on another dataframe?
I have two dataframes, df1 and df2. This is the content of df1:
col1 col2 col3
0 1 12 100
1 2 34 200
2 3 56 300
3 4 78 400
This is the content of df2:
col1 col2 col3
0 2 1984 500
1 3 4891 600
I want to have this final data frame:
col1 col2 col3
0 1 12 100
1 2 1984 200
2 3 4891 300
3 4 78 400
Note that col1 is the primary key in both df1 and df2. I tried to do it by mapping values, but I could not make it work.
Here is an MCVE for checking those data frames easily:
import pandas as pd
d = {'col1': ['1', '2','3','4'], 'col2': [12, 34,56,78],'col3':[100,200,300,400]}
df1 = pd.DataFrame(data=d)
d = {'col1': ['2','3'], 'col2': [1984,4891],'col3':[500,600]}
df2 = pd.DataFrame(data=d)
print(df1)
print(df2)
d = {'col1': ['1', '2','3','4'], 'col2': [12, 1984,4891,78],'col3':[100,200,300,400]}
df_final = pd.DataFrame(data=d)
print(df_final)
You can use map and fillna:
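df1['col2'] = (df1['col1']
               .map(df2.set_index('col1')['col2'])  # NaN where col1 is not in df2
               .fillna(df1['col2'])                 # keep the original value there
               .astype(df1['col2'].dtype)           # map introduced floats; restore the integer dtype
              )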
Output:
col1 col2 col3
0 1 12 100
1 2 1984 200
2 3 4891 300
3 4 78 400
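To see why the fillna step is needed, the chain can be broken into its intermediate results. This is only a sketch built on the frames from the question. The names lookup, mapped and result are illustrative and not part of the original answer, and the dtype is restored with astype, which assumes every filled value is a whole number.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['1', '2', '3', '4'],
                    'col2': [12, 34, 56, 78],
                    'col3': [100, 200, 300, 400]})
df2 = pd.DataFrame({'col1': ['2', '3'],
                    'col2': [1984, 4891],
                    'col3': [500, 600]})

# Lookup table: the index is the key (col1), the values are the new col2.
# Series.map with a Series argument requires this index to be unique.
lookup = df2.set_index('col1')['col2']

# Keys that are missing from df2 ('1' and '4') come back as NaN.
mapped = df1['col1'].map(lookup)

# Fill those gaps with the original values, then restore the integer dtype.
result = df1.copy()
result['col2'] = mapped.fillna(df1['col2']).astype(df1['col2'].dtype)
print(result)
```

Working on a copy (result) leaves df1 untouched, which makes it easier to compare the frames before and after the replacement.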
If col1 is unique, combine_first is an option, too:
>>> (df2.drop("col3", axis=1)
...     .set_index("col1")
...     .combine_first(df1.set_index("col1"))
...     .reset_index()
... )
col1 col2 col3
0 1 12 100
1 2 1984 200
2 3 4891 300
3 4 78 400
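As a self-contained sketch of the same idea, the uniqueness assumption can be checked explicitly before combining. The name combined is illustrative, and the frames are the ones from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['1', '2', '3', '4'],
                    'col2': [12, 34, 56, 78],
                    'col3': [100, 200, 300, 400]})
df2 = pd.DataFrame({'col1': ['2', '3'],
                    'col2': [1984, 4891],
                    'col3': [500, 600]})

# combine_first aligns rows on the index, so the key should be
# unique in both frames for the result to be well defined.
assert df1['col1'].is_unique and df2['col1'].is_unique

combined = (df2.drop(columns='col3')               # only col2 should win over df1
               .set_index('col1')
               .combine_first(df1.set_index('col1'))  # fall back to df1 elsewhere
               .reset_index())
print(combined)
```

Dropping col3 from df2 first is what keeps df1's col3 values (200 and 300) in the result. Without that step, df2's 500 and 600 would take precedence for the matching keys.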