I have two dataframes, each containing the same two columns (date and key). I would like to create a new column in one dataframe with the value 1 if the (date, key) pair exists in the other dataframe, and 0 if it does not. Here is an example:
df1:
+---------+--------+
| date | key |
+---------+--------+
| date1 | A |
+---------+--------+
| date2 | A |
+---------+--------+
| date3 | B |
+---------+--------+
df2:
+---------+--------+
| date | key |
+---------+--------+
| date1 | A |
+---------+--------+
| date4 | C |
+---------+--------+
| date5 | B |
+---------+--------+
resulting df1:
+---------+--------+--------+
| date | key | col3 |
+---------+--------+--------+
| date1 | A | 1 |
+---------+--------+--------+
| date2 | A | 0 |
+---------+--------+--------+
| date3 | B | 0 |
+---------+--------+--------+
In this example, as the first row of df1 (date1, A) exists in df2, the value of col3 is 1, and the other rows are 0.
How can I do it?
Use the indicator parameter of merge to create the new column, then convert it to 1/0 by comparing it with the string 'both':
df = df1.merge(df2, how='left', indicator='col3', on=['date','key'])
df['col3'] = df['col3'].eq('both').astype(int)
Or:
import numpy as np
df['col3'] = np.where(df['col3'].eq('both'), 1, 0)
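For completeness, here is a minimal runnable sketch of the merge approach on the sample data from the question. The placeholder strings ("date1", "A", ...) stand in for real values, and the `lookup` name is only illustrative. The `drop_duplicates()` step is an extra precaution not in the answer above: if df2 could contain the same (date, key) pair more than once, a left merge would otherwise repeat the matching rows of df1.

```python
import pandas as pd

# Sample data from the question (placeholder strings instead of real dates).
df1 = pd.DataFrame({"date": ["date1", "date2", "date3"],
                    "key": ["A", "A", "B"]})
df2 = pd.DataFrame({"date": ["date1", "date4", "date5"],
                    "key": ["A", "C", "B"]})

# Keep each (date, key) pair once so the left merge cannot add rows to df1.
lookup = df2[["date", "key"]].drop_duplicates()

# indicator='col3' adds a column with 'left_only', 'right_only' or 'both'.
result = df1.merge(lookup, how="left", on=["date", "key"], indicator="col3")

# 'both' means the pair was found in df2; convert that to 1, everything else to 0.
result["col3"] = result["col3"].eq("both").astype(int)

print(result)
```

With this data, col3 comes out as 1, 0, 0, matching the expected result in the question, and the row count stays the same as in df1.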