How do I map a column of string values in one dataframe to another column in another dataframe?
I have two dataframes:
Dataframe A contains the following column and values -
| Column 1 |
|---|
| A; B; C |
| A; D; E |
| B; C |
Dataframe B contains the following columns and values -
| Column 1 | Column 2 |
|---|---|
| A | Apple |
| B | Banana |
| C | Cat |
| D | Dog |
| E | Egg |
Now, I want to map the values in Column 1 of Dataframe 2 to the values in Column 1 of Dataframe 1 to obtain the following column in Dataframe 1:
| Column 1 | Derived Column |
|---|---|
| A; B; C | Apple; Banana; Cat |
| A; D; E | Apple; Dog; Egg |
| B; C | Banana; Cat |
My first thought was to iterate through each row in Dataframe 1, split the value in column 1 by ';', and then map it to Dataframe 2, but I have ~100k rows in Dataframe 1 and ~10k rows in Dataframe 2, which would make this computationally expensive. Is there a much faster way to do this? Thanks!
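For reference, the row-by-row idea described above can be sketched as follows. This is only an illustration, assuming the sample values shown in the tables and the hypothetical names `mapping` and `translate`. If Dataframe 2 is first turned into a plain dict, each token costs one dictionary lookup, so the work grows with the number of tokens in Dataframe 1 rather than with rows in Dataframe 1 times rows in Dataframe 2.

```python
import pandas as pd

# Sample data mirroring the tables in the question (assumed values).
df1 = pd.DataFrame({"Column 1": ["A; B; C", "A; D; E", "B; C"]})
df2 = pd.DataFrame({
    "Column 1": ["A", "B", "C", "D", "E"],
    "Column 2": ["Apple", "Banana", "Cat", "Dog", "Egg"],
})

# Plain dict built once from Dataframe 2: code -> label.
mapping = dict(zip(df2["Column 1"], df2["Column 2"]))

def translate(cell):
    """Split one cell on ';', look up each token, and join the results."""
    parts = [p.strip() for p in cell.split(";")]
    # Tokens missing from the mapping are left unchanged.
    return "; ".join(mapping.get(p, p) for p in parts)

df1["Derived Column"] = [translate(c) for c in df1["Column 1"]]
print(df1)
```

This sketch assumes every cell in Column 1 is a string; missing values would need to be handled separately.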
Try Series.replace with regex=True, after building a lookup from the second df (Column 1 as the index, Column 2 as the values):
    # Look up each key from df2's Column 1 inside df1's strings and substitute Column 2.
    df1['Column 2'] = df1['Column 1'].replace(df2.set_index('Column 1')['Column 2'], regex=True)
    print(df1)

      Column 1            Column 2
    0  A; B; C  Apple; Banana; Cat
    1  A; D; E     Apple; Dog; Egg
    2     B; C         Banana; Cat
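One thing to keep in mind: with regex=True each key is treated as a pattern, so a key can also match inside a longer token, and keys containing regex metacharacters would need escaping. If exact token matches are wanted instead, a possible alternative is to split, explode, map, and re-join. The sketch below is an assumption-laden illustration, not a benchmarked solution: the sample frames and the names `mapping`, `tokens`, and `mapped` are made up here, and it assumes Column 1 contains no missing values.

```python
import pandas as pd

# Sample data mirroring the example above (assumed values).
df1 = pd.DataFrame({"Column 1": ["A; B; C", "A; D; E", "B; C"]})
df2 = pd.DataFrame({
    "Column 1": ["A", "B", "C", "D", "E"],
    "Column 2": ["Apple", "Banana", "Cat", "Dog", "Egg"],
})

# Plain dict lookup: code -> label.
mapping = dict(zip(df2["Column 1"], df2["Column 2"]))

tokens = (
    df1["Column 1"]
    .str.split(";")   # "A; B; C" -> ["A", " B", " C"]
    .explode()        # one row per token; the original row index is repeated
    .str.strip()      # remove whitespace around each token
)

# Exact-match lookup; tokens not found in the mapping are left unchanged.
mapped = tokens.map(lambda t: mapping.get(t, t))

# Re-join the tokens that share an original row index.
df1["Derived Column"] = mapped.groupby(level=0).agg("; ".join)
print(df1)
```

Because the lookup is by whole token, a key such as "A" only replaces a token that is exactly "A", and regex special characters in the keys need no escaping.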