I would like to join two DataFrames on a composite key (Car, ID): if both values match a row in the first df, then use the Test column value to create a new column.
# Import pandas library
import pandas as pd
# Rows for the first DataFrame: (Car, ID)
data1 = [['ford', 1010], ['chevy', 1515], ['toyota', 1515]]
df_1 = pd.DataFrame(data1, columns=['Car', 'ID'])
# Rows for the second DataFrame: (Car, ID, Test)
data2 = [['ford', 1010, 'sat'], ['chevy', 1515, 'unsat'], ['toyota', 1515, 'sat']]
df_2 = pd.DataFrame(data2, columns=['Car', 'ID', 'Test'])
I currently use merge for single-column joins, but joining on ID alone yields incorrect results here, since different cars can share the same test ID.
df_1_2 = pd.merge(df_1, df_2, on ='ID', how='left')
print(df_1_2)
The result I am looking for is something like this:
Just do a merge without the on= keyword. By default, merge joins on every column name the two DataFrames have in common, which here is both Car and ID:
df_1_2 = pd.merge(df_1, df_2)
#       Car    ID   Test
# 0    ford  1010    sat
# 1   chevy  1515  unsat
# 2  toyota  1515    sat
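If you would rather not rely on the default, you can also name the composite key explicitly by passing a list to on=. The sketch below reuses the frames from the question; how='left' and validate='one_to_one' are optional choices added here for illustration (not part of the original answer), assuming you want to keep every row of df_1 and fail loudly if a (Car, ID) pair is duplicated.

```python
import pandas as pd

# Same sample data as in the question
df_1 = pd.DataFrame(
    [['ford', 1010], ['chevy', 1515], ['toyota', 1515]],
    columns=['Car', 'ID'],
)
df_2 = pd.DataFrame(
    [['ford', 1010, 'sat'], ['chevy', 1515, 'unsat'], ['toyota', 1515, 'sat']],
    columns=['Car', 'ID', 'Test'],
)

# Join on both columns at once; a row matches only if Car AND ID are equal.
# validate='one_to_one' raises MergeError if the key is not unique on either side.
df_1_2 = pd.merge(df_1, df_2, on=['Car', 'ID'], how='left', validate='one_to_one')
print(df_1_2)
```

With this data the result has the same three rows as the merge above, each car paired with its own Test value.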