Python regex to remove emails from string
I need to remove email addresses from a string, so:
inp = 'abc user@xxx.com 123 any@www foo @ bar 78@ppp @5555 aa@111'
should result in:
out = 'abc 123 foo bar'
What regex should I use?
In [148]: e = '[^\@]\@[^\@]'
In [149]: pattern = re.compile(e)
In [150]: pattern.sub('', s)
Out[150]: 'one aom 123 4two'
In [151]: s
Out[151]: 'one ab@com 123 4 @ two'
This does not work for me.
Replace:
\S*@\S*\s?
with ''
Some explanations:
\S* : match as many non-space characters as you can
@ : then an @
\S* : then another sequence of non-space characters
\s? : and finally a space, if there is one. Note that the '?' is needed to match an address at the end of the line. Because '?' is greedy, if there is a space, it will always be matched.
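As a minimal sketch of how this pattern might be applied in Python: the helper name remove_emails is hypothetical, and the final strip() is an addition that is not part of the answer above. It is there because a trailing space is left behind when the last token in the string is removed.

```python
import re

# Pattern from the answer above: an optional run of non-space characters,
# an '@', another optional run of non-space characters, then at most one
# trailing whitespace character.
EMAIL_LIKE = re.compile(r'\S*@\S*\s?')

def remove_emails(text):
    """Hypothetical helper: drop every whitespace-delimited token containing '@'."""
    # strip() is an assumption added here; without it the space that
    # preceded a removed final token stays in the result.
    return EMAIL_LIKE.sub('', text).strip()

inp = 'abc user@xxx.com 123 any@www foo @ bar 78@ppp @5555 aa@111'
print(remove_emails(inp))  # abc 123 foo bar
```

Note that this removes anything containing an @, including a bare "@", which is what the expected output in the question asks for. It does not check that a token is a valid email address.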
I personally prefer doing string parsing myself. Let's try splitting the string and getting rid of the items that contain the @ symbol:
inp = 'abc user@xxx.com 123 any@www foo @ bar 78@ppp @5555 aa@111'
items = inp.split()
Now we can do something like this:
>>> [i for i in items if '@' not in i]
['abc', '123', 'foo', 'bar']
That gets us almost there. Let's modify it a bit more to add a join:
>>> ' '.join([i for i in inp.split() if '@' not in i])
'abc 123 foo bar'
It may not be RegEx, but it works for the input you gave.
out = ' '.join([item for item in inp.split() if '@' not in item])
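One thing to keep in mind with the split-and-join approach: split() with no arguments splits on any run of whitespace, so repeated spaces, tabs, and newlines in the input collapse to single spaces in the result. A small sketch of that behaviour (remove_at_tokens is a hypothetical name, not something from the answers above):

```python
def remove_at_tokens(text):
    """Hypothetical helper wrapping the split/join one-liner above."""
    # split() with no arguments treats any run of whitespace as one separator.
    return ' '.join(item for item in text.split() if '@' not in item)

# Double space and newline in the input both become a single space.
print(remove_at_tokens('abc  user@xxx.com\n123'))  # abc 123
```

If the original spacing has to be preserved, a regex substitution is probably the better fit.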