Pandas Merge and Sum Data Frames

I have the following data frames:

Data Frame 1:

ID1,ID2,VAL1,VAL2
CAR,RED,5,5
TRUCK,RED,6,6
CAR,BLUE,1,1

Data Frame 2:

ID1,ID2,VAL1,VAL2
BIKE,RED,5,5
TRUCK,BLACK,6,6
CAR,RED,1,1

I want to left join these two data frames on the key columns ID1 and ID2, and at the same time sum the value columns VAL1 and VAL2 for rows that match. For instance, the output would be:

ID1,ID2,VAL1,VAL2
CAR,RED,6,6
TRUCK,RED,6,6
CAR,BLUE,1,1

I've tried all sorts of combinations of pandas.merge and have had no luck. Can someone help me out?

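To join data frames in pandas, use pd.merge. Here both frames use the same names for the key columns, so it is enough to pass the list of those names as the on parameter. After a left join, rows of df_1 with no match in df_2 have NaN in the right-hand value columns; fillna(0) replaces those with 0 so the sums in the next step work: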

merged = pd.merge(df_1, df_2, on=["ID1", "ID2"], how="left").fillna(0)
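To see what the merge produces before any sums are computed, here is a small self-contained sketch. It assumes the two frames are built by hand from the sample data in the question (the names df_1 and df_2 follow the answer). Because VAL1 and VAL2 exist in both frames, pandas appends its default suffixes _x (left) and _y (right) to them.

```python
import pandas as pd

# Sample data copied from the question.
df_1 = pd.DataFrame({
    "ID1": ["CAR", "TRUCK", "CAR"],
    "ID2": ["RED", "RED", "BLUE"],
    "VAL1": [5, 6, 1],
    "VAL2": [5, 6, 1],
})
df_2 = pd.DataFrame({
    "ID1": ["BIKE", "TRUCK", "CAR"],
    "ID2": ["RED", "BLACK", "RED"],
    "VAL1": [5, 6, 1],
    "VAL2": [5, 6, 1],
})

# Left join on the key columns; rows of df_1 without a match get NaN, then 0.
merged = pd.merge(df_1, df_2, on=["ID1", "ID2"], how="left").fillna(0)

print(list(merged.columns))
print(merged)
```

Only the (CAR, RED) row finds a partner in df_2, so it is the only row whose _y columns are non-zero. The _y columns come back as floats because the unmatched rows held NaN before fillna(0).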

Next, compute the summed columns, for example with df.assign. Because VAL1 and VAL2 exist in both frames, the merge suffixes them with _x (left) and _y (right):

merged = merged.assign(
    VAL1 = lambda x: x.VAL1_x + x.VAL1_y,
    VAL2 = lambda x: x.VAL2_x + x.VAL2_y)

Result:

columns = df_1.columns
merged[columns]

     ID1   ID2  VAL1  VAL2
0    CAR   RED   6.0   6.0
1  TRUCK   RED   6.0   6.0
2    CAR  BLUE   1.0   1.0
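The steps above can be bundled into one helper. The following is only a sketch: the function name left_join_sum and its parameters are hypothetical, it assumes every value column exists in both frames, and it assumes the keys are unique in the right frame (duplicate keys there would multiply rows). It also casts the result back to the left frame's dtypes, so integer inputs give 6 rather than 6.0.

```python
import pandas as pd

def left_join_sum(left, right, keys, value_cols):
    """Left-join `right` onto `left` on `keys` and add matching value columns.

    Hypothetical helper: assumes each name in `value_cols` is present in both
    frames and that `keys` are unique in `right`.
    """
    merged = pd.merge(left, right, on=keys, how="left", suffixes=("_x", "_y"))
    for col in value_cols:
        # Unmatched rows have NaN on the right-hand side; count them as 0.
        merged[col] = merged[f"{col}_x"] + merged[f"{col}_y"].fillna(0)
    # Keep the left frame's column layout and restore its dtypes.
    return merged[list(left.columns)].astype(left.dtypes.to_dict())

# Sample data copied from the question.
df_1 = pd.DataFrame({
    "ID1": ["CAR", "TRUCK", "CAR"],
    "ID2": ["RED", "RED", "BLUE"],
    "VAL1": [5, 6, 1],
    "VAL2": [5, 6, 1],
})
df_2 = pd.DataFrame({
    "ID1": ["BIKE", "TRUCK", "CAR"],
    "ID2": ["RED", "BLACK", "RED"],
    "VAL1": [5, 6, 1],
    "VAL2": [5, 6, 1],
})

result = left_join_sum(df_1, df_2, keys=["ID1", "ID2"], value_cols=["VAL1", "VAL2"])
print(result)
```

Rows that exist only in df_2, such as (BIKE, RED), are dropped, which is what a left join is expected to do.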
