Map two dataframes in a case insensitive way (Python pandas)

I have two dataframes of different lengths.

db
index| Size   | GROUP FORMAT
1    | AA     | Unknown
2    | BB     | Unknown
3    | CC     | Unknown

db2
index| GROUP FORMAT| FORMAT
1    | G1          | Aa
2    | G2          | bB

The FORMAT column of db2 and the Size column of db contain the same letters, but the upper/lower case may differ. I want to map them to get:

db
index| Size   | GROUP FORMAT
1    | AA     | G1
2    | BB     | G2
3    | CC     | Unknown

However, if possible, I'd rather not duplicate and drop any column. Is it possible to map the two dataframes in a case insensitive way?

Try converting FORMAT to upper case so that it matches Size (which is already upper case in your example), then merge:

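# Upper-case db2's FORMAT, left-merge on Size, then take the matched group
# ('GROUP FORMAT_y') and keep the old value wherever there was no match.
# Assumes db has the default 0..n-1 index so the result lines up row by row.
db['GROUP FORMAT'] = (db.merge(db2.assign(FORMAT=db2['FORMAT'].str.upper()),
                               left_on='Size', right_on='FORMAT', how='left')
                        ['GROUP FORMAT_y']
                        .fillna(db['GROUP FORMAT'])
                     )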

Output:

   index Size GROUP FORMAT
0      1   AA           G1
1      2   BB           G2
2      3   CC      Unknown
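
If you would rather avoid the merge and its suffixed columns altogether, a dictionary lookup might also do the job. This is only a sketch: db and db2 are rebuilt here from the tables in the question, lookup is just an illustrative name, and I'm assuming that normalizing both sides with str.casefold() (str.upper() would behave the same for plain ASCII letters) is an acceptable definition of "case insensitive" for your data.

```python
import pandas as pd

# Sample frames rebuilt from the question (assumed layout).
db = pd.DataFrame({"Size": ["AA", "BB", "CC"],
                   "GROUP FORMAT": ["Unknown", "Unknown", "Unknown"]})
db2 = pd.DataFrame({"GROUP FORMAT": ["G1", "G2"],
                    "FORMAT": ["Aa", "bB"]})

# Map each case-folded FORMAT value to its group.
# If two FORMAT values differ only by case, the later row wins.
lookup = dict(zip(db2["FORMAT"].str.casefold(), db2["GROUP FORMAT"]))

# Case-fold Size the same way, look it up, and keep the existing
# value wherever no match is found (map() yields NaN there).
db["GROUP FORMAT"] = (db["Size"].str.casefold()
                        .map(lookup)
                        .fillna(db["GROUP FORMAT"]))

print(db)
```

Nothing is added to or dropped from either dataframe, and since the mapped Series keeps db's own index, this does not depend on db having a default index.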


 