Remove specific characters from a pandas column?
Hello, I have a dataframe where I want to remove a specific set of characters, 'Fwd:', from every row that starts with it. The issue I am facing is that the code I am using also strips the first letter from any row that starts with the letter 'F'.
My dataframe looks like this:
summary
0 Fwd: Please look at the attached documents and take action
1 NSN for the ones who care
2 News for all team members
3 Fwd: Please take action on the action needed items
4 Fix all the mistakes please
When I used the code:
df['Clean Summary'] = individual_receivers['summary'].map(lambda x: x.lstrip('Fwd:'))
I end up with a dataframe that looks like this:
summary
0 Please look at the attached documents and take action
1 NSN for the ones who care
2 News for all team members
3 Please take action on the action needed items
4 ix all the mistakes please
I don't want the last row to lose the F in 'Fix'.
You should use a regex, remembering that ^ means "starts with":
df['Clean Summary'] = df['summary'].str.replace(r'^Fwd: ', '', regex=True)
Here's an example:
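import pandas as pd

df = pd.DataFrame({'msg': ['Fwd: o', 'oe', 'Fwd: oj'], 'B': [1, 2, 3]})
# regex=True is needed in pandas 2.0+, where str.replace matches literally by default
df['clean_msg'] = df['msg'].str.replace(r'^Fwd: ', '', regex=True)
print(df)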
Output:
msg B clean_msg
0 Fwd: o 1 o
1 oe 2 oe
2 Fwd: oj 3 oj
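The anchored pattern can also be checked outside pandas with the standard re module. The following is a minimal sketch run against the rows from the question. The name FWD_PREFIX and the optional-whitespace pattern `\s*` are illustrative assumptions, not part of the original answer.

```python
import re

# Hypothetical name; ^ anchors the match to the start of the string,
# and \s* also consumes any whitespace that follows the colon.
FWD_PREFIX = re.compile(r"^Fwd:\s*")

summaries = [
    "Fwd: Please look at the attached documents and take action",
    "NSN for the ones who care",
    "News for all team members",
    "Fwd: Please take action on the action needed items",
    "Fix all the mistakes please",
]

# Only rows that begin with "Fwd:" are changed; "Fix ..." keeps its F.
cleaned = [FWD_PREFIX.sub("", s) for s in summaries]

for row in cleaned:
    print(row)
```

With pandas, the same pattern string should work as `df['summary'].str.replace(r'^Fwd:\s*', '', regex=True)`.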
You are not only losing 'F' but also 'w', 'd', and ':'. This is how lstrip works: it treats the passed string as a set of characters and removes every leading character that belongs to that set, in any order and any combination, rather than removing the string as a prefix.
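A short sketch of that character-set behaviour, using plain Python strings (the third sample string is made up for illustration):

```python
# str.lstrip treats its argument as a set of characters, not as a prefix.
samples = [
    "Fwd: Please take action",
    "Fix all the mistakes please",
    "dwF: reversed letters",  # made-up row: same characters, different order
]

stripped = [s.lstrip("Fwd:") for s in samples]

for before, after in zip(samples, stripped):
    print(repr(before), "->", repr(after))

# 'Fwd: Please take action'     -> ' Please take action'
# 'Fix all the mistakes please' -> 'ix all the mistakes please'
# 'dwF: reversed letters'       -> ' reversed letters'
```

Stripping stops at the first character that is not in the set, which is why the space after the colon survives and why 'Fix' loses only its 'F'.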
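You should actually use x.replace('Fwd:', '', 1). The 1 ensures that only the first occurrence of the string is removed. Note that the first occurrence does not have to be at the start of the string, so a row containing 'Fwd:' somewhere in the middle would also be changed.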
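If the prefix should be removed only when it is actually at the start, one option is an explicit startswith check. This is a sketch, not the answerer's code; the helper name strip_forward_prefix and the sample rows are hypothetical.

```python
PREFIX = "Fwd:"


def strip_forward_prefix(text: str) -> str:
    """Remove one leading 'Fwd:' plus the whitespace after it; otherwise return text unchanged."""
    if text.startswith(PREFIX):
        return text[len(PREFIX):].lstrip()
    return text


rows = [
    "Fwd: Please take action on the action needed items",
    "Fix all the mistakes please",
    "Re: see Fwd: below",  # made-up row with 'Fwd:' in the middle
]

prefix_only = [strip_forward_prefix(r) for r in rows]
first_occurrence = [r.replace(PREFIX, "", 1) for r in rows]

for a, b in zip(prefix_only, first_occurrence):
    print(repr(a), "|", repr(b))
```

The function can be passed to `Series.map` in the same way as the lambda in the question. On Python 3.9+, `str.removeprefix('Fwd:')` performs the same prefix-only removal without the whitespace cleanup.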