How to set a column's value according to another DataFrame column's value
I have two DataFrames:
First: df1
import pandas as pd

df1 = {'NAME': ['A','B','C','D'],
       'GROUP': ['A1','B1','C1','D1']
      }
df1 = pd.DataFrame(df1, columns=['NAME','GROUP'])
  NAME GROUP
0    A    A1
1    B    B1
2    C    C1
3    D    D1
Second: df2
df2 = {'NAME': ['AA','AAA','AAAA','BB','BBB','BBBB','CC','CCC','CCCC','DD','DDD','DDDD'],
'GROUP': ['','','','','','','','','','','','']
}
df2 = pd.DataFrame(df2,columns=['NAME','GROUP'])
NAME GROUP
0 AA
1 AAA
2 AAAA
3 BB
4 BBB
5 BBBB
6 CC
7 CCC
8 CCCC
9 DD
10 DDD
11 DDDD
My task is to set GROUP in df2 according to the NAME in df1.
I think I need to use contains: if a value of df1['NAME'] is contained in df2['NAME'], set GROUP to the df1['GROUP'] value from that row. I tried using a loop and converting the DataFrames into arrays, but it didn't help.
Use Series.str.extract to create a matching column you can merge on, then bring the group over. Drop the existing 'GROUP' column before the merge. I left the 'match' column in for clarity.
In the case of multiple substring matches, because this uses .str.extract, it will merge with only the first substring match. (Multiple matches can be handled with .str.extractall and some groupby to combine everything into, say, a list.)
pat = '(' + '|'.join(df1['NAME']) +')'
df2['match'] = df2['NAME'].str.extract(pat)
df2 = df2.drop(columns='GROUP').merge(df1.rename(columns={'NAME': 'match'}), how='left')
print(df2)
    NAME match GROUP
0     AA     A    A1
1    AAA     A    A1
2   AAAA     A    A1
3     BB     B    B1
4    BBB     B    B1
5   BBBB     B    B1
6     CC     C    C1
7    CCC     C    C1
8   CCCC     C    C1
9     DD     D    D1
10   DDD     D    D1
11  DDDD     D    D1
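For the multiple-match case mentioned above, here is a minimal sketch of the .str.extractall plus groupby idea. It is only one possible way to do it. The df2 values below are hypothetical (chosen so that some names contain more than one df1 name), and the re.escape call and the de-duplication of groups are added precautions, not part of the original answer.

```python
import re
import pandas as pd

df1 = pd.DataFrame({'NAME': ['A', 'B', 'C', 'D'],
                    'GROUP': ['A1', 'B1', 'C1', 'D1']})
# Hypothetical data: some names contain several df1 names, one contains none.
df2 = pd.DataFrame({'NAME': ['AA', 'AB', 'CDC', 'XYZ']})

# Escape each name in case it contains regex metacharacters (assumed precaution).
pat = '(' + '|'.join(map(re.escape, df1['NAME'])) + ')'

# extractall yields one row per match, indexed by (original row, match number).
matches = df2['NAME'].str.extractall(pat)[0]

# Map each matched name to its group, then collect the distinct groups
# per original row into a list, preserving first-seen order.
name_to_group = df1.set_index('NAME')['GROUP']
groups = (matches.map(name_to_group)
                 .groupby(level=0)
                 .agg(lambda s: list(dict.fromkeys(s))))

# Assignment aligns on the index, so rows without any match end up as NaN.
df2['GROUP'] = groups
print(df2)
```

With this data, 'AB' ends up with both A1 and B1 in its list, while 'XYZ' has no match and is left as NaN. Whether a list per row is the right shape depends on what you do with the result next.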