I'm working with two dataframes:
Dataframe1 looks like:
user (index) | apples | bananas |
---|---|---|
Pete | 4 | 2 |
Sara | 5 | 10 |
Kara | 4 | 2 |
Tom | 3 | 3 |
Dataframe2 looks like:
index | user |
---|---|
1 | Pete |
2 | Sara |
I want to create a new boolean column in Dataframe1 that is True if the user is in Dataframe2. The output should look like:
user | apples | bananas | new column |
---|---|---|---|
Pete | 4 | 2 | True |
Sara | 5 | 10 | True |
Kara | 4 | 2 | False |
Tom | 3 | 3 | False |
I tried using a lambda function but didn't get very far.
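Here is an easy way to do that:
df = df.reset_index()  # move the 'user' index into a regular column
df2['new_column'] = True  # flag every user present in df2
df = pd.merge(df, df2, on='user', how='left')  # users missing from df2 get NaN
df['new_column'] = df['new_column'].fillna(False).astype(bool)  # assign back; chained inplace fillna may not update df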
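For reference, here is a self-contained sketch of the left-merge approach using the sample data from the question. It assumes `user` is the index of the first frame; the names `df1`, `df2`, `flags`, and `result` are illustrative. The flag is built on a copy so the lookup frame is left unchanged, and `notna()` is used in place of `fillna` to get a plain boolean column.

```python
import pandas as pd

# Sample data from the question; 'user' is assumed to be the index of df1.
df1 = pd.DataFrame(
    {"apples": [4, 5, 4, 3], "bananas": [2, 10, 2, 3]},
    index=pd.Index(["Pete", "Sara", "Kara", "Tom"], name="user"),
)
df2 = pd.DataFrame({"user": ["Pete", "Sara"]}, index=[1, 2])

# Tag every user in the lookup frame, working on a copy so df2 is not modified.
# drop_duplicates() guards against repeated users multiplying rows in the merge.
flags = df2[["user"]].drop_duplicates().assign(new_column=True)

# A left merge keeps all rows of df1; users missing from df2 get NaN in new_column.
result = df1.reset_index().merge(flags, on="user", how="left")

# Matched rows hold True and unmatched rows hold NaN, so notna() yields the flag.
result["new_column"] = result["new_column"].notna()

print(result)
```

If `user` should remain the index afterwards, `result.set_index("user")` restores it.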
You can leverage the indicator parameter of df.merge, then use df.replace:
In [598]: x = df1.merge(df2['user'], left_on='user (index)', right_on='user', how='left', indicator='new column').replace({'both': True, 'left_only': False}).drop(columns='user')
In [599]: x
Out[599]:
user (index) apples bananas new column
0 Pete 4 2 True
1 Sara 5 10 True
2 Kara 4 2 False
3 Tom 3 3 False
OR:
For somewhat better performance, use Series.map instead of df.replace:
In [609]: y = df1.merge(df2['user'], left_on='user (index)', right_on='user', how='left', indicator='new column').drop(columns='user')
In [611]: y['new column'] = y['new column'].map({'both': True, 'left_only':False})
In [612]: y
Out[612]:
user (index) apples bananas new column
0 Pete 4 2 True
1 Sara 5 10 True
2 Kara 4 2 False
3 Tom 3 3 False
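As a compact variant of the indicator idea, the indicator column can be compared directly with `'both'` instead of being mapped or replaced. The sketch below is self-contained and makes a few assumptions that are not in the answer above: `user` is an ordinary column in both frames (so a single `on='user'` key is enough and no extra column needs dropping), and the names `df1`, `df2`, and `merged` are illustrative.

```python
import pandas as pd

# Sample data from the question, with 'user' as a regular column (an assumption).
df1 = pd.DataFrame({
    "user": ["Pete", "Sara", "Kara", "Tom"],
    "apples": [4, 5, 4, 3],
    "bananas": [2, 10, 2, 3],
})
df2 = pd.DataFrame({"user": ["Pete", "Sara"]}, index=[1, 2])

# indicator= adds a column reporting where each row was found:
# 'both' for users present in df2, 'left_only' for the rest.
merged = df1.merge(
    df2[["user"]].drop_duplicates(),  # deduplicate so no df1 row is repeated
    on="user",
    how="left",
    indicator="new column",
)

# Comparing with 'both' turns the indicator into a plain boolean column.
merged["new column"] = merged["new column"] == "both"

print(merged)
```

Because the comparison yields a bool Series directly, no mapping dictionary is needed and the `right_only` category never has to be handled.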