
Python regex to remove emails from string

I need to remove emails from a string, so:

inp = 'abc user@xxx.com 123 any@www foo @ bar 78@ppp @5555 aa@111'

should result in:

out = 'abc 123 foo bar'

What regex should I use?

In [148]: e = '[^\@]\@[^\@]'
In [149]: pattern = re.compile(e)
In [150]: pattern.sub('', s)  
Out[150]: 'one aom 123 4two'
In [151]: s
Out[151]: 'one ab@com 123 4 @ two'

This does not work for me.

Replace:
\S*@\S*\s?
with '' (the empty string).


Some explanations:
\S* : match as many non-whitespace characters as possible
@ : then an @
\S* : then another run of non-whitespace characters
\s? : and finally one whitespace character, if there is one. The '?' is needed so that an address at the end of the line still matches. Because '?' is greedy, a space that is present will always be consumed.
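
In Python, the replacement described above could look like the following minimal sketch. The names EMAIL_LIKE and remove_emails are made up for illustration, and the final strip() is an added assumption: without it, the question's sample input leaves a trailing space after 'bar'.

```python
import re

# \S*@\S*\s? : any token containing '@', plus one trailing
# whitespace character if there is one.
EMAIL_LIKE = re.compile(r'\S*@\S*\s?')  # hypothetical name

def remove_emails(text):
    """Remove every whitespace-delimited token that contains '@'."""
    # strip() drops the space left over when the last token is removed
    # (note that it also trims any leading whitespace in the input).
    return EMAIL_LIKE.sub('', text).strip()

inp = 'abc user@xxx.com 123 any@www foo @ bar 78@ppp @5555 aa@111'
print(remove_emails(inp))  # abc 123 foo bar
```

Note that this removes anything containing an @, including the lone '@' between 'foo' and 'bar', which is what the question asks for; it does not check that a token is a valid email address.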

I personally prefer doing the string parsing myself. Let's try splitting the string and dropping the items that contain the @ symbol:

inp = 'abc user@xxx.com 123 any@www foo @ bar 78@ppp @5555 aa@111'
items = inp.split()

Now we can do something like this:

>>> [i for i in items if '@' not in i]
['abc', '123', 'foo', 'bar']

That gets us almost there. Let's modify it a bit more to add a join:

>>> ' '.join([i for i in inp.split() if '@' not in i])
'abc 123 foo bar'

It may not be a regex, but it works for the input you gave.

out = ' '.join([item for item in inp.split() if '@' not in item])
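
One difference between the approaches is worth knowing, shown here as a small sketch on a made-up input (the variable names are only for illustration): the regex substitution leaves the remaining whitespace untouched, while split() followed by join() collapses every run of whitespace into a single space.

```python
import re

# Made-up input: two spaces after 'one', a tab before '123'.
text = 'one  ab@com\t123'

# Regex approach: removes the token and one following whitespace character.
by_regex = re.sub(r'\S*@\S*\s?', '', text)

# Split/join approach: rebuilds the string with single spaces.
by_split = ' '.join(item for item in text.split() if '@' not in item)

print(repr(by_regex))  # 'one  123'
print(repr(by_split))  # 'one 123'
```

If the original spacing matters, the regex is probably the safer choice; if normalized spacing is fine, the split version is arguably easier to read.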
