I have a large dataframe with two columns and many rows. Here is a small example:
import pandas as pd
df1 = pd.DataFrame({"text":["see you in five minutes.", "she is my friend.", "she goes to school in five minutes.","he is my friend.","that is right.","sky is blue.","sky is yellow."],
"goal":[" "," "," "," "," "," "," "]})
I also have three other dataframes of different sizes. Each of them contains some of the rows from the text column in df1:
df2 = pd.DataFrame({"text":["see you in five minutes.", "he is my friend."],
"second":["num","friend"]})
df3 = pd.DataFrame({"text":["she goes to school in five minutes.","she is my friend.","that is right."],
"third":["num","friend","correct"]})
df4 = pd.DataFrame({"text":["sky is blue.","sky is yellow."],
"fourth":["color","color"]})
What I would like to do is merge the 'second', 'third' and 'fourth' columns into df1 so that their values populate the empty 'goal' column in df1, matched on the text column.
Desired output:
df1 = {"text":["see you in five minutes.", "she is my friend.", "she goes to school in five minutes.","he is my friend.","that is right.","sky is blue.","sky is yellow."],
"goal":["num","friend","num","friend","correct","color","color"]}
I tried a left merge once for each dataframe, but each merge puts its values in a separate column. Is there a way to do this in one step and write all the values to the goal column?
Thank you.
Create a mapping series m by concatenating the dataframes df2, df3, and df4 using pd.concat, then use this mapping series along with Series.map to map the text column in df1:
m = pd.concat([df.set_index('text').iloc[:, 0] for df in (df2, df3, df4)])
df1['goal'] = df1['text'].map(m)
Result:
# print(df1)
                                  text     goal
0             see you in five minutes.      num
1                    she is my friend.   friend
2  she goes to school in five minutes.      num
3                     he is my friend.   friend
4                       that is right.  correct
5                         sky is blue.    color
6                       sky is yellow.    color
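For reference, the approach above can be put together as a self-contained sketch using the example data from the question. The duplicate-handling line is an assumption that is not part of the original answer: Series.map needs the mapping series to have a unique index, so if the same text could appear in more than one of the lookup dataframes, one way to handle it is to keep only the first occurrence.

```python
import pandas as pd

df1 = pd.DataFrame({
    "text": ["see you in five minutes.", "she is my friend.",
             "she goes to school in five minutes.", "he is my friend.",
             "that is right.", "sky is blue.", "sky is yellow."],
    "goal": [" "] * 7,
})
df2 = pd.DataFrame({"text": ["see you in five minutes.", "he is my friend."],
                    "second": ["num", "friend"]})
df3 = pd.DataFrame({"text": ["she goes to school in five minutes.",
                             "she is my friend.", "that is right."],
                    "third": ["num", "friend", "correct"]})
df4 = pd.DataFrame({"text": ["sky is blue.", "sky is yellow."],
                    "fourth": ["color", "color"]})

# Build one lookup series: the index is the text, the values are the labels.
# iloc[:, 0] picks the single remaining column, whatever it is called.
m = pd.concat([df.set_index("text").iloc[:, 0] for df in (df2, df3, df4)])

# Assumption (not in the original answer): if a text occurs in several
# lookup dataframes, keep the first label so the index stays unique.
m = m[~m.index.duplicated(keep="first")]

# Look up each text in the mapping; texts with no match become NaN.
df1["goal"] = df1["text"].map(m)
print(df1)
```

Any text in df1 that does not appear in df2, df3, or df4 ends up as NaN in goal, which makes unmatched rows easy to spot with df1["goal"].isna().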