How to map a column of string values in one dataframe to another column in another dataframe?

Question

I have two dataframes:

Dataframe A contains the following column and values:

Column 1
A; B; C
A; D; E
B; C

Dataframe B contains the following columns and values:

Column 1	Column 2
A	Apple
B	Banana
C	Cat
D	Dog
E	Egg

Now, I want to map the values in Column 1 of Dataframe 2 to the values in Column 1 of Dataframe 1 to obtain the following column in Dataframe 1:

Column 1	Derived Column
A; B; C	Apple; Banana; Cat
A; D; E	Apple; Dog; Egg
B; C	Banana; Cat

My first thought was to iterate through each row in Dataframe 1, split the value in Column 1 by ';', and then map each piece to Dataframe 2. However, I have ~100k rows in Dataframe 1 and ~10k rows in Dataframe 2, which would make this computationally expensive. Is there a much faster way to do this? Thanks!

Answer 1

Try Series.replace with regex=True, after building a lookup from the second df (a Series of Column 2 values indexed by Column 1):

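# Lookup: index holds the keys (Column 1), values hold the replacements (Column 2)
mapping = df2.set_index('Column 1')['Column 2']
df1['Column 2'] = df1['Column 1'].replace(mapping, regex=True)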

print(df1)

  Column 1            Column 2
0  A; B; C  Apple; Banana; Cat
1  A; D; E     Apple; Dog; Egg
2     B; C         Banana; Cat
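
Because regex=True matches substrings, keys that occur inside other keys (or that contain regex metacharacters) may be replaced unintentionally. As an alternative, here is a minimal sketch that splits each cell into tokens and looks each one up by exact match. The sample frames are rebuilt from the question, and the choices to keep unknown tokens unchanged and to assume no missing cells are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question.
df1 = pd.DataFrame({"Column 1": ["A; B; C", "A; D; E", "B; C"]})
df2 = pd.DataFrame({
    "Column 1": ["A", "B", "C", "D", "E"],
    "Column 2": ["Apple", "Banana", "Cat", "Dog", "Egg"],
})

# Build the lookup once: key -> replacement text.
mapping = dict(zip(df2["Column 1"], df2["Column 2"]))

# One token per row; the original row label is repeated in the index.
# Assumes every cell is a string (no missing values).
tokens = df1["Column 1"].str.split(";").explode().str.strip()

# Exact-match lookup; tokens without a match are kept as-is (assumption).
mapped = tokens.map(lambda t: mapping.get(t, t))

# Re-join the tokens that came from the same original row.
df1["Derived Column"] = mapped.groupby(level=0).agg("; ".join)

print(df1)
```

Each token costs one dictionary lookup, so the work grows with the total number of tokens in Dataframe 1 rather than with the number of rows in Dataframe 2.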

Answer 1: 5, accepted, 2021-03-16 04:40:27
