I have two data frames, A and B, and I want to match the name columns between them. If a name in A exists in B, I need to create a new column in A holding the corresponding id from B; if it does not exist, the new column should be 0.
Here is the data and the code I wrote:
#data B
email name id
hi@amal.com amal call 6
hi@hotmail.com amal 6
hi@gmail.com AMAL boy 6
hi@boy.com boy 7
hi@hotmail.com boy 7
hi@call.com call AMAL 9
hi@hotmail.com boy 7
hi@dog.com dog 8
hi@outlook.com dog 8
hi@gmail.com dog 8
#data A
id name
1 amal
1 AMAL
2 call
4 dog
3 boy
First I built a boolean mask with str.contains:
A.name.str.contains('|'.join(B.name))
Then I tried to create the new column:
A["new"] = np.where(A.name.str.contains('|'.join(B.name))==True, B.id, 0)
but I get this error:
ValueError: operands could not be broadcast together with shapes (5,) (10,) ()
What I expected is:
id name new
1 amal 6
1 AMAL 0
2 call 0
4 dog 8
3 boy 7
Any help?
Use Series.map with a lookup Series built from B after removing duplicated names with DataFrame.drop_duplicates, then replace missing values with Series.fillna and convert to integers:
A["new"] = A.name.map(B.drop_duplicates('name').set_index('name')['id']).fillna(0).astype(int)
print (A)
id name new
0 1 amal 6
1 1 AMAL 0
2 2 call 0
3 4 dog 8
4 3 boy 7
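For reference, here is a self-contained sketch of the same approach that rebuilds both frames from the sample data in the question. The intermediate name `lookup` is introduced only for illustration. Note that Series.map matches names exactly and case-sensitively, which is why "AMAL" and "call" get 0: B only contains them inside longer strings such as "AMAL boy" and "amal call".

```python
import pandas as pd

# Sample data B from the question
B = pd.DataFrame({
    "email": ["hi@amal.com", "hi@hotmail.com", "hi@gmail.com", "hi@boy.com",
              "hi@hotmail.com", "hi@call.com", "hi@hotmail.com", "hi@dog.com",
              "hi@outlook.com", "hi@gmail.com"],
    "name": ["amal call", "amal", "AMAL boy", "boy", "boy",
             "call AMAL", "boy", "dog", "dog", "dog"],
    "id": [6, 6, 6, 7, 7, 9, 7, 8, 8, 8],
})

# Sample data A from the question
A = pd.DataFrame({
    "id": [1, 1, 2, 4, 3],
    "name": ["amal", "AMAL", "call", "dog", "boy"],
})

# One row per distinct name, so the lookup index is unique (map requires this)
lookup = B.drop_duplicates(subset="name").set_index("name")["id"]

# Exact-match lookup; names missing from B become NaN, then 0
A["new"] = A["name"].map(lookup).fillna(0).astype(int)
print(A)
```

The original np.where call fails because the condition has 5 elements (one per row of A) while B.id has 10, and NumPy cannot broadcast those two shapes together. Mapping through a Series indexed by name avoids the problem because the result always has one value per row of A.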