How can I create a new DataFrame column with values based on a condition on 2 columns in Python?

I have 2 dataframes, each containing the same 2 columns (date and key). I would like to create a new column in one dataframe with the value 1 if the (date, key) pair exists in the other dataframe, and 0 if it does not. Here is an example:

df1:
+---------+--------+
|  date   |  key   |
+---------+--------+
|  date1  |    A   |
+---------+--------+
|  date2  |    A   |
+---------+--------+
|  date3  |    B   |
+---------+--------+


df2:
+---------+--------+
|  date   |  key   |
+---------+--------+
|  date1  |    A   |
+---------+--------+
|  date4  |    C   |
+---------+--------+
|  date5  |    B   |
+---------+--------+


resulting df1:

+---------+--------+--------+
|  date   |  key   |  col3  |
+---------+--------+--------+
|  date1  |    A   |   1    |
+---------+--------+--------+
|  date2  |    A   |   0    |
+---------+--------+--------+
|  date3  |    B   |   0    |
+---------+--------+--------+


In this example, the first row of df1 (date1, A) also exists in df2, so its col3 value is 1. The other rows have no matching (date, key) pair in df2, so their value is 0.

How can I do it?

Use the indicator parameter of merge to create the new column, then convert it to 1/0 by comparing against the string 'both':

df = df1.merge(df2, how='left', indicator='col3', on=['date','key'])
df['col3'] = df['col3'].eq('both').astype(int)
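
To try this out, here is a minimal, self-contained sketch that applies the two lines above to the sample data from the question. The strings 'date1', 'A', and so on are placeholders standing in for real dates and keys.

```python
import pandas as pd

# Sample data copied from the question (placeholder strings, not real dates).
df1 = pd.DataFrame({'date': ['date1', 'date2', 'date3'],
                    'key': ['A', 'A', 'B']})
df2 = pd.DataFrame({'date': ['date1', 'date4', 'date5'],
                    'key': ['A', 'C', 'B']})

# A left merge keeps every row of df1. The indicator column records whether
# each (date, key) pair was also found in df2 ('both') or not ('left_only').
df = df1.merge(df2, how='left', indicator='col3', on=['date', 'key'])

# Turn the indicator into 1 (pair found in df2) or 0 (not found).
df['col3'] = df['col3'].eq('both').astype(int)
print(df)
```

Only the first row should end up with col3 equal to 1, because (date3, B) and (date5, B) share a key but not a date.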

Or, as an alternative to the second line, using NumPy (requires import numpy as np):

df['col3'] = np.where(df['col3'].eq('both'), 1, 0)
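
One caveat that the question's sample data does not show: if df2 contains the same (date, key) pair more than once, a left merge repeats the matching df1 row once per duplicate. The sketch below assumes that can happen in your data and guards against it by dropping duplicate pairs from df2 before merging. The duplicated row in df2 is made up for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'date': ['date1', 'date2', 'date3'],
                    'key': ['A', 'A', 'B']})
# Hypothetical df2 in which (date1, A) appears twice.
df2 = pd.DataFrame({'date': ['date1', 'date1', 'date4'],
                    'key': ['A', 'A', 'C']})

# Keep one copy of each (date, key) pair so the merge cannot add rows to df1.
pairs = df2[['date', 'key']].drop_duplicates()

df = df1.merge(pairs, how='left', indicator='col3', on=['date', 'key'])
df['col3'] = np.where(df['col3'].eq('both'), 1, 0)
print(df)
```

With the duplicates removed, the result has exactly as many rows as df1.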
