How to remove special characters from a string before a specific character?

I have a df with a column called EMAIL, which contains various email addresses. I want to remove all the special characters, specifically ., -, and _, that come before the @ and store the result in a new column, NEW_EMAIL. For example, if df['EMAIL'] = 'ab_cd_123@email.com', I want df['NEW_EMAIL'] = 'abcd123@email.com'.

I was able to remove periods with the code below, but I cannot seem to remove underscores or dashes in the same line of code. Right now, I am repeating the same line of code to remove those three special characters, which is quite ugly. Can someone lend me a hand, please? Thank you in advance for your help.

df['NEW_EMAIL'] = df.EMAIL.str.replace(r'\.(?!.{1,4}$)','', regex = True)
df['NEW_EMAIL'] = df.NEW_EMAIL.str.replace(r'\.(?!.{1,4}$)','', regex = True)
df['NEW_EMAIL'] = df.NEW_EMAIL.str.replace(r'\.(?!.{1,4}$)','', regex = True)

You can use

df['NEW_EMAIL'] = df['EMAIL'].str.replace(r'[._-](?=[^@]*@)', '', regex=True)

See the regex demo. Details:

  • [._-] - a ., _ or - character
  • (?=[^@]*@) - a positive lookahead that requires zero or more characters other than @, followed by a @ character, immediately to the right of the current location.
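To see why the lookahead leaves the domain alone, here is a minimal sketch that applies the same pattern with the standard re module. The helper name clean_local_part and the sample address are assumptions made for illustration, not part of the original answer.

```python
import re

# Same pattern as above: a ., _ or - that still has an @ somewhere to its right.
PATTERN = re.compile(r'[._-](?=[^@]*@)')

def clean_local_part(email: str) -> str:
    """Remove ., _ and - only from the part of the address before the @."""
    return PATTERN.sub('', email)

# The - and . after the @ are kept, because no @ follows them.
print(clean_local_part('ab_cd.12-3@my-mail.co.uk'))  # abcd123@my-mail.co.uk
```

A string with no @ at all is returned unchanged, since the lookahead can never succeed.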

If you need to replace/remove any special character, you should use

df['NEW_EMAIL'] = df['EMAIL'].str.replace(r'[\W_](?=[^@]*@)', '', regex=True)
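As a rough illustration of the broader pattern: \W matches any non-word character, and _ is listed separately because the underscore counts as a word character. The sketch below uses the standard re module; the helper name strip_special and the sample address are made up for the example.

```python
import re

# [\W_] = any non-word character, plus the underscore (which \w would otherwise keep).
ANY_SPECIAL = re.compile(r'[\W_](?=[^@]*@)')

def strip_special(email: str) -> str:
    """Remove every non-alphanumeric character that appears before the @."""
    return ANY_SPECIAL.sub('', email)

print(strip_special('a+b!c_d@e-mail.com'))  # abcd@e-mail.com
```

Note that in Python 3 the \W class is Unicode-aware for str patterns, so accented letters before the @ are kept rather than removed.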

See a Pandas test:

>>> import pandas as pd
>>> df = pd.DataFrame({'EMAIL':['ab_cd_123@email.com', 'ab_cd.12-3@email.com']})
>>> df['EMAIL'].str.replace(r'[._-](?=[^@]*@)', '', regex=True)
0    abcd123@email.com
1    abcd123@email.com
Name: EMAIL, dtype: object
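If a lookahead feels hard to read, one possible alternative (not from the original answer) is to split the address at the @ and clean only the left-hand part. The function name strip_local_part and its chars parameter are hypothetical, and the sketch assumes each address contains at most one @.

```python
def strip_local_part(email: str, chars: str = '._-') -> str:
    """Drop the given characters from the text before the first @."""
    local, at, domain = email.partition('@')
    if not at:
        # No @ found: leave the string untouched, like the regex version does.
        return email
    # str.maketrans with a third argument builds a table that deletes those characters.
    return local.translate(str.maketrans('', '', chars)) + at + domain

print(strip_local_part('ab_cd.12-3@email.com'))  # abcd123@email.com
```

With pandas, this could presumably be applied as df['NEW_EMAIL'] = df['EMAIL'].map(strip_local_part). For addresses with more than one @, the result differs from the regex version, which strips characters before the last @ rather than the first.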
