
pandas: Replace string is not replacing targeted substring

I am trying to iterate over a list of strings in dataframe1 and check whether any of them appear in dataframe2, so that I can replace them there.

for index, row in nlp_df.iterrows():
    print( row['x1'] )
    string1 = row['x1'].replace("(","\(")
    string1 = string1.replace(")","\)")
    string1 = string1.replace("[","\[")
    string1 = string1.replace("]","\]")
    nlp2_df['title'] = nlp2_df['title'].replace(string1,"")

To do this, I iterated with the code shown above to find and replace any string from df1.

The output below shows the strings in df1:

wait_timeout
interactive_timeout
pool_recycle
....
__all__
folder_name
re.compile('he(lo') 

The output below shows df2 after replacing the strings:

0   have you tried watching the traffic between th...
1   /dev/cu.xxxxx is the "callout" device, it's wh...
2               You'll want the struct package.\r\r\n

In df2, a string such as /dev/cu.xxxxx should have been removed during the iteration, but as shown it is still there. However, when I ran nlp2_df['title'] = nlp2_df['title'].replace("/dev/cu.xxxxx","") directly, it was removed successfully. Is there a reason why writing the string literally works, but looping and passing a variable as the replacement target doesn't?

Thanks in advance!

IIUC, you can simply use regular expressions:

nlp2_df['title'] = nlp2_df['title'].str.replace(r'([\(\)\[\]])', r'\\\1', regex=True)

PS: you don't need a for loop at all...

Demo:

In [15]: df
Out[15]:
           title
0  aaa (bbb) ccc
1   A [word] ...

In [16]: df['new'] = df['title'].str.replace(r'([\(\)\[\]])', r'\\\1', regex=True)

In [17]: df
Out[17]:
           title              new
0  aaa (bbb) ccc  aaa \(bbb\) ccc
1   A [word] ...   A \[word\] ...
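
The loop-free idea can also be applied to the original goal. The following is only a minimal sketch, assuming the aim is to delete every string in nlp_df['x1'] wherever it occurs inside nlp2_df['title']; the sample rows are made up for illustration. It also shows a difference that may explain the behaviour in the question: Series.replace without regex=True matches whole cell values, while Series.str.replace works on substrings.

```python
import re
import pandas as pd

# Hypothetical sample data; only the column names 'x1' and 'title' come from the question.
nlp_df = pd.DataFrame({"x1": ["/dev/cu.xxxxx", "he(lo", "[word]"]})
nlp2_df = pd.DataFrame({"title": [
    "/dev/cu.xxxxx is the callout device",
    "say he(lo to a [word]",
    "nothing to remove here",
]})

# Series.replace without regex=True only swaps cells whose entire value equals
# the search string, so a substring inside a longer title is left untouched.
whole_value = nlp2_df["title"].replace("/dev/cu.xxxxx", "")

# re.escape escapes every regex metacharacter (parentheses, brackets, dots, ...),
# so no manual .replace("(", "\(") calls are needed.
# Longer terms go first so they take precedence over shorter terms with the same prefix.
terms = sorted(nlp_df["x1"].dropna().astype(str), key=len, reverse=True)
pattern = "|".join(re.escape(t) for t in terms)

# Series.str.replace does substring replacement; regex=True is passed explicitly
# because the default value of this parameter differs between pandas versions.
cleaned = nlp2_df["title"].str.replace(pattern, "", regex=True)
print(cleaned.tolist())
```

Building one alternation pattern means the column is scanned once rather than once per row of nlp_df. If nlp_df['x1'] can contain empty strings, they would need to be filtered out first, since an empty alternative matches everywhere.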
