Replace rows of strings in a dataframe with corresponding words from another dataframe (pandas)

I have a dataframe df with one column:

     List
 0   What are you trying to achieve
 1   What is your purpose right here
 2   When students don’t have a proper foundation
 3   I am going to DESCRIBE a sunset

I have another dataframe, df2, which has two columns:

    original       correct
0     are          were
1     sunset       sunrise
2     I            we
3     right        correct
4     is           was

I want to replace every word in df that appears in the original column of df2 with the corresponding word from the correct column, and store the resulting strings in a new dataframe, df_new.

Is this possible without loops or explicit iteration, using only plain pandas operations?

That is, my df_new should contain:

     List
 0   What were you trying to achieve
 1   What was your purpose correct here
 2   When students don’t have a proper foundation
 3   we am going to DESCRIBE a sunrise

Also, this is just a test example. My df might contain millions of rows of strings, and so might df2. What would be the most efficient approach?

One of many possible solutions:

In [371]: boundary = r'\b'  # word boundary, so 'is' does not match inside 'this'
     ...:
     ...: df.List.replace((boundary + df2.original + boundary).values.tolist(),
     ...:                 df2.correct.values.tolist(),
     ...:                 regex=True)
     ...:
Out[371]:
0                  What were you trying to achieve
1               What was your purpose correct here
2     When students don’t have a proper foundation
3                we am going to DESCRIBE a sunrise
Name: List, dtype: object
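
Since the question asks about very large inputs, here is a rough sketch of one alternative, not a benchmarked recommendation. It builds a single alternation regex from df2 and looks each match up in a dict, so every string is scanned once instead of once per word pair. The names mapping, words and pattern are made up for this illustration, and the sample frames are rebuilt from the question so the snippet runs on its own.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"List": [
    "What are you trying to achieve",
    "What is your purpose right here",
    "When students don’t have a proper foundation",
    "I am going to DESCRIBE a sunset",
]})
df2 = pd.DataFrame({
    "original": ["are", "sunset", "I", "right", "is"],
    "correct": ["were", "sunrise", "we", "correct", "was"],
})

# Hypothetical helper names: a plain dict from original word to replacement.
mapping = dict(zip(df2["original"], df2["correct"]))

# One pattern of the form \b(?:w1|w2|...)\b. re.escape protects against
# regex metacharacters in the words; longer words are listed first so a
# longer entry is tried before a shorter one that is its prefix.
words = sorted(mapping, key=len, reverse=True)
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, words)) + r")\b")

# Each match is replaced by its dict entry in a single pass per string.
df_new = pd.DataFrame({
    "List": df["List"].str.replace(
        pattern, lambda m: mapping[m.group(0)], regex=True
    )
})

print(df_new)
```

One behavioural difference to be aware of: this version does not chain replacements. If a word in the correct column also appears in the original column, the list-based replace above may rewrite it again, while the single-pass version leaves it alone. Whether it is actually faster for millions of rows and word pairs would need to be measured on the real data.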
