How to replace part of email address with another string in pandas?

I have a dataframe with email addresses. I need to replace the ending of every email address with '.mn'. By ending I mean '.org', '.com', etc.

Ex. John@smith.com becomes John@smith.mn

Not sure what I am doing wrong.

This is what I have so far, but it neither replaces anything nor gives me an error message:

email['ADDR'] = email['ADDR'].str.replace(r'[.]{2,}', '.mn')

Thank you in advance.

This should do it:

email['ADDR'] = email['ADDR'].str.replace(r'.{3}$', 'mn', regex=True)
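
For context, the pattern in the question, `[.]{2,}`, asks for two or more consecutive literal dots, which an address such as John@smith.com never contains, so nothing is replaced and no error is raised. A minimal sketch using only the standard `re` module (the sample address is the one from the question; the variable names are illustrative):

```python
import re

addr = "John@smith.com"

# [.]{2,} means "two or more literal dots in a row"; no such run exists here,
# so the string comes back unchanged.
unchanged = re.sub(r"[.]{2,}", ".mn", addr)

# .{3}$ means "the last three characters, whatever they are".
fixed = re.sub(r".{3}$", "mn", addr)

print(unchanged)  # John@smith.com
print(fixed)      # John@smith.mn
```

In pandas 2.0 and later, `str.replace` also treats the pattern as a literal string unless `regex=True` is passed, which is another way a regex pattern can silently fail to match.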

If you need to handle variable-length domains (.edu, .com1, and so on), you can use:

email

             ADDR
0  john@smith.com
1    test@abc.edu
2    foo@bar.abcd

email['ADDR'].str.replace(r'\..{2,}$', '.mn', regex=True)

0    john@smith.mn
1      test@abc.mn
2       foo@bar.mn
Name: ADDR, dtype: object
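
Note that `\..{2,}$` starts matching at the first dot in the string, so an address with a dot before the final ending (for example a hypothetical first.last@abc.edu) would be cut too early. A possible variant, sketched here under the assumption that the "ending" is whatever follows the last dot, matches a run of non-dot characters instead:

```python
import pandas as pd

# Sample addresses are made up for illustration.
email = pd.DataFrame(
    {"ADDR": ["john@smith.com", "first.last@abc.edu", "foo@bar.co.uk"]}
)

# \.[^.]+$ : a literal dot followed by non-dot characters up to the end,
# so only the final label is replaced.
result = email["ADDR"].str.replace(r"\.[^.]+$", ".mn", regex=True)

print(result.tolist())
# ['john@smith.mn', 'first.last@abc.mn', 'foo@bar.co.mn']
```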

Another method that handles variable-length top-level endings is to use str.rsplit:

In[72]:
df = pd.DataFrame({'email':['John@smith.com','John@smith.x','John@smith.hello']})
df

Out[72]: 
              email
0    John@smith.com
1      John@smith.x
2  John@smith.hello

In[73]:
df['email'] = df['email'].str.rsplit('.', n=1).str[0] + '.mn'
df

Out[73]: 
           email
0  John@smith.mn
1  John@smith.mn
2  John@smith.mn

This finds the last dot, takes the left-hand side, and appends the new suffix.
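
The split-from-the-right idea can be sketched in a self-contained form as follows. Passing `n=1` limits the split to the last dot; without it, every dot is split on and `.str[0]` would return only the text before the first dot. The second sample address is made up to show that case.

```python
import pandas as pd

df = pd.DataFrame({"email": ["John@smith.com", "first.last@smith.org"]})

# n=1 splits once, starting from the right, so element 0 is everything
# before the last dot.
df["email"] = df["email"].str.rsplit(".", n=1).str[0] + ".mn"

print(df["email"].tolist())
# ['John@smith.mn', 'first.last@smith.mn']
```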
