Pandas: how to merge two dataframes on multiple columns?
I have two dataframes, df1 and df2.
df1 contains information about some interactions between people.
df1
Name1 Name2
0 Jack John
1 Sarah Jack
2 Sarah Eva
3 Eva Tom
4 Eva John
df2 contains the status of people in general, including some of the people in df1.
df2
Name Y
0 Jack 0
1 John 1
2 Sarah 0
3 Tom 1
4 Laura 0
I would like df2 to cover only the people who appear in df1 (so Laura disappears), and to keep NaN for those who appear in df1 but not in df2 (i.e. Eva), such as:
df2
Name Y
0 Jack 0
1 John 1
2 Sarah 0
3 Tom 1
4 Eva NaN
Create a DataFrame from the unique values of df1 and map it against df2:
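import numpy as np
import pandas as pd

df = pd.DataFrame(np.unique(df1.values), columns=['Name'])  # sorted unique names from both columns
df['Y'] = df.Name.map(df2.set_index('Name')['Y'])  # names missing from df2 map to NaN
print(df)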
Name Y
0 Eva NaN
1 Jack 0.0
2 John 1.0
3 Sarah 0.0
4 Tom 1.0
Note: the original order is not preserved, because np.unique returns the names sorted.
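If the order matters, one possible variant is sketched below. It is not part of the original answer: it swaps np.unique for pd.unique, which keeps values in order of first appearance, and uses a left merge instead of map. The sample frames are rebuilt from the tables in the question, and the variable names (names, result) are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "Name1": ["Jack", "Sarah", "Sarah", "Eva", "Eva"],
    "Name2": ["John", "Jack", "Eva", "Tom", "John"],
})
df2 = pd.DataFrame({
    "Name": ["Jack", "John", "Sarah", "Tom", "Laura"],
    "Y": [0, 1, 0, 1, 0],
})

# Flatten both name columns row by row; pd.unique keeps the
# first-appearance order, whereas np.unique would sort the names.
names = pd.unique(df1[["Name1", "Name2"]].to_numpy().ravel())

# Left merge: every name from df1 is kept, names absent from df2 get NaN,
# and names that only exist in df2 (Laura) are dropped.
result = pd.DataFrame({"Name": names}).merge(df2, on="Name", how="left")
print(result)
```

Here the rows follow the order in which each name first shows up in df1, read row by row, rather than alphabetical order.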
You can create a list of the unique names in df1 and use isin:
names = np.unique(df1[['Name1', 'Name2']].values.ravel())
# Blank out Y for every df2 row whose Name does not occur in df1
df2.loc[~df2['Name'].isin(names), 'Y'] = np.nan
print(df2)
Name Y
0 Jack 0.0
1 John 1.0
2 Sarah 0.0
3 Tom 1.0
4 Laura NaN
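The isin output above keeps Laura (with NaN) and does not add Eva, so it differs from the table requested in the question. A minimal sketch of one way to get that exact table follows. It is an assumption-laden variant rather than the answerer's code: it assumes the names in df2 are unique (reindex needs a unique index), and the helper names kept, missing and result are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "Name1": ["Jack", "Sarah", "Sarah", "Eva", "Eva"],
    "Name2": ["John", "Jack", "Eva", "Tom", "John"],
})
df2 = pd.DataFrame({
    "Name": ["Jack", "John", "Sarah", "Tom", "Laura"],
    "Y": [0, 1, 0, 1, 0],
})

names = np.unique(df1[["Name1", "Name2"]].values.ravel())

# Names from df2 that also occur in df1, in df2's own order (drops Laura).
kept = df2.loc[df2["Name"].isin(names), "Name"].tolist()

# Names that occur in df1 but have no row in df2 (Eva).
known = set(df2["Name"])
missing = [name for name in names if name not in known]

# Reindex on the combined label list: labels without a row in df2 get NaN.
result = (
    df2.set_index("Name")
       .reindex(kept + missing)
       .rename_axis("Name")
       .reset_index()
)
print(result)
```

Unlike the in-place assignment, this leaves df2 itself untouched and returns a new frame.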