Search for a substring in a DataFrame and replace it

Question

I have a situation where spurious data gets appended to my URLs, and I am trying to clean it up.

For example:

www.one@foxturn.com/!ut/5 # Real link
www.one@foxturn.com/ut1/5_RTFDEERERTGFEFD # System adds junk to it
www.one@foxturn.com/ut1/5_dvkerfddfrejermsdkasmf # System adds junk to it

I am trying to clean this up by dropping everything after !ut.

So far I have tried:

SPA_MX = Mexico['Page URL'].str.startswith("http://www.www.one@foxturn.com/ut1")

but this returns a boolean Series rather than the cleaned strings.

I would like advice on the most efficient way to achieve this.

Answer 1

You can do this by calling apply on the column: use find to locate the index of the pattern and slice the string if the pattern is found. The + 3 keeps the !ut marker itself in the result, and rows without the marker are returned unchanged:

In[69]:

df['url'].apply(lambda x: x[:x.find('!ut') + 3] if x.find('!ut') != -1 else x)

Out[69]: 
0                             www.one@foxturn.com/!ut
1           www.one@foxturn.com/ut1/5_RTFDEERERTGFEFD
2    www.one@foxturn.com/ut1/5_dvkerfddfrejermsdkasmf
Name: url, dtype: object
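
As a possible alternative to apply, the same truncation can be sketched with the vectorized string accessor. This is only a sketch under assumptions: the frame and the column name `url` are rebuilt here from the sample rows in the question, and `regex=True` is passed explicitly, which requires a reasonably recent pandas.

```python
import pandas as pd

# Sample frame rebuilt from the rows in the question (column name assumed).
df = pd.DataFrame({"url": [
    "www.one@foxturn.com/!ut/5",
    "www.one@foxturn.com/ut1/5_RTFDEERERTGFEFD",
    "www.one@foxturn.com/ut1/5_dvkerfddfrejermsdkasmf",
]})

# Keep everything up to and including the first "!ut" and drop the rest.
# Rows that do not contain "!ut" do not match and are left unchanged.
cleaned = df["url"].str.replace(r"(!ut).*", r"\1", regex=True)

print(cleaned.tolist())
```

To overwrite the column in place, the result can be assigned back with `df["url"] = cleaned`.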

Answer 2

my_string = "www.one@foxturn.com/!ut/5"
# Split on the marker and keep only the part before it.
final = my_string.split("!ut")[0]
print(final)

Output:

www.one@foxturn.com/
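
Note that splitting on the marker drops "!ut" itself, whereas Answer 1 keeps it. A small helper built on str.partition can make that choice explicit. The function name `truncate_after` and its `keep_marker` parameter are hypothetical, introduced only for this sketch:

```python
def truncate_after(text, marker="!ut", keep_marker=True):
    """Cut text at the first occurrence of marker (hypothetical helper)."""
    head, sep, _tail = text.partition(marker)
    if not sep:
        # Marker not present: return the string unchanged.
        return text
    return head + sep if keep_marker else head


# Sample values taken from the question.
urls = [
    "www.one@foxturn.com/!ut/5",
    "www.one@foxturn.com/ut1/5_RTFDEERERTGFEFD",
]

for url in urls:
    print(truncate_after(url), "|", truncate_after(url, keep_marker=False))
```

The same function could be passed to a pandas column with `df['url'].apply(truncate_after)`.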


Answer 1
1, accepted, 2017-07-04 13:56:17

Answer 2
1, 2017-07-04 14:00:15
