How to set a value in a column according to another DataFrame column's value

Question

I have two DataFrames:

First: df1

import pandas as pd

df1 = pd.DataFrame({'NAME': ['A', 'B', 'C', 'D'],
                    'GROUP': ['A1', 'B1', 'C1', 'D1']})

  NAME GROUP
0    A    A1
1    B    B1
2    C    C1
3    D    D1

Second: df2

df2 = pd.DataFrame({'NAME': ['AA', 'AAA', 'AAAA', 'BB', 'BBB', 'BBBB',
                             'CC', 'CCC', 'CCCC', 'DD', 'DDD', 'DDDD'],
                    'GROUP': [''] * 12})

    NAME GROUP
0     AA
1    AAA
2   AAAA
3     BB
4    BBB
5   BBBB
6     CC
7    CCC
8   CCCC
9     DD
10   DDD
11  DDDD

My task is to set GROUP in df2 according to the NAME in df1.

I think I need to use contains: if a value from df1['NAME'] is contained in df2['NAME'], set GROUP in df2 to the corresponding value of df1['GROUP']. I tried using a loop and converting the DataFrames into arrays, but it didn't help.

Answer 1

Use Series.str.extract to create a column of matched substrings that you can merge on, then bring GROUP over from df1. Drop the empty 'GROUP' column that already exists in df2 before the merge. I left the 'match' column in for clarity.

If a name contains more than one of the substrings, .str.extract keeps only the first match, so the merge uses only that one. (Multiple matches can be handled with .str.extractall and a groupby that combines everything into, say, a list.)

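# Build one capture group that matches any NAME from df1, e.g. '(A|B|C|D)'
pat = '(' + '|'.join(df1['NAME']) + ')'
# expand=False returns a Series holding the first match (NaN if none)
df2['match'] = df2['NAME'].str.extract(pat, expand=False)

# Drop the empty GROUP column, then pull GROUP from df1 via the shared 'match' column
df2 = df2.drop(columns='GROUP').merge(df1.rename(columns={'NAME': 'match'}), how='left')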

print(df2)

    NAME match GROUP
0     AA     A    A1
1    AAA     A    A1
2   AAAA     A    A1
3     BB     B    B1
4    BBB     B    B1
5   BBBB     B    B1
6     CC     C    C1
7    CCC     C    C1
8   CCCC     C    C1
9     DD     D    D1
10   DDD     D    D1
11  DDDD     D    D1
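
For the multiple-match case mentioned above, here is a rough sketch of how .str.extractall plus a groupby could collect every matching group into a list. The df2 below is a made-up example (not the one from the question) in which 'AB' contains two of the df1 names, the 'row' axis name is just an illustrative label, and re.escape is added on the assumption that real names might contain regex metacharacters.

```python
import re
import pandas as pd

df1 = pd.DataFrame({'NAME': ['A', 'B', 'C', 'D'],
                    'GROUP': ['A1', 'B1', 'C1', 'D1']})
# Hypothetical data: 'AB' contains two df1 names, 'XYZ' contains none
df2 = pd.DataFrame({'NAME': ['AA', 'AB', 'CCC', 'XYZ']})

# Escape each name so it is matched literally inside the alternation
pat = '(' + '|'.join(re.escape(name) for name in df1['NAME']) + ')'

# extractall yields one row per match, indexed by (original row, 'match' number)
found = (df2['NAME'].str.extractall(pat)
         .droplevel('match')             # keep only the original row label
         .rename(columns={0: 'match'})   # the unnamed capture group is column 0
         .rename_axis('row')
         .reset_index()
         .drop_duplicates())             # 'AA' matches 'A' twice; keep it once

# Look up the GROUP for every matched substring
found = found.merge(df1.rename(columns={'NAME': 'match'}), on='match', how='left')

# Combine all groups found for each original row into a list
df2['GROUP'] = found.groupby('row')['GROUP'].agg(list)

print(df2)
```

Rows with no match at all (here 'XYZ') end up with NaN in GROUP, because the assignment aligns on the original index.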
