
Python regex to remove emails from string

I need to remove emails from a string, so that:

inp = 'abc user@xxx.com 123 any@www foo @ bar 78@ppp @5555 aa@111'

should result in:

out = 'abc 123 foo bar'

What regex to use?

In [148]: e = '[^\@]\@[^\@]'
In [149]: pattern = re.compile(e)
In [150]: pattern.sub('', s)  
Out[150]: 'one aom 123 4two'
In [151]: s
Out[151]: 'one ab@com 123 4 @ two'

This does not work for me.

Replace:
\S*@\S*\s?
with '' (the empty string).


Some explanations:
\S* : match as many non-whitespace characters as possible
@ : then an @
\S* : then another run of non-whitespace characters
\s? : and finally an optional whitespace character. The '?' is needed so that an address at the end of the line still matches. Because '?' is greedy, a space is always consumed when one is present.
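As a rough sketch of how the pattern above could be applied in Python (the helper name remove_emails is made up for illustration, and the strip() call is an addition of mine, not part of the pattern): the substitution leaves a space after the last kept word when the tokens following it are removed, so the result is stripped to match the expected output exactly.

```python
import re

# Any run of non-whitespace around an '@', plus one optional trailing whitespace.
# A raw string keeps the backslashes single.
EMAIL_LIKE = re.compile(r'\S*@\S*\s?')

def remove_emails(text):
    """Hypothetical helper: drop every whitespace-delimited token containing '@'."""
    # strip() removes the space that stays behind the last kept word
    # when the tokens after it are removed.
    return EMAIL_LIKE.sub('', text).strip()

inp = 'abc user@xxx.com 123 any@www foo @ bar 78@ppp @5555 aa@111'
print(remove_emails(inp))  # abc 123 foo bar
```

Note that this removes anything containing an @, including a lone '@', which is what the question asks for; it does not try to validate real email addresses.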

I personally prefer doing string parsing myself. Let's try splitting the string and getting rid of the items that have the @ symbol:

inp = 'abc user@xxx.com 123 any@www foo @ bar 78@ppp @5555 aa@111'
items = inp.split()

Now we can do something like this:

>>> [i for i in items if '@' not in i]
['abc', '123', 'foo', 'bar']

That gets us almost there. Let's modify it a bit more to join the remaining items back into a string:

>>> ' '.join([i for i in inp.split() if '@' not in i])
'abc 123 foo bar'

It may not be RegEx, but it works for the input you gave.
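One thing to be aware of with the split/join approach, shown here as a small sketch (the function name drop_at_tokens is hypothetical): split() with no arguments splits on any run of whitespace, so tabs, newlines and repeated spaces in the input all come back as single spaces.

```python
def drop_at_tokens(text):
    """Hypothetical helper: keep only whitespace-delimited tokens without '@'."""
    # split() with no arguments splits on any whitespace run and
    # never yields empty strings, so the original spacing is not preserved.
    return ' '.join(tok for tok in text.split() if '@' not in tok)

print(drop_at_tokens('abc  user@xxx.com\t123'))  # abc 123
```

If the original spacing matters, a regex substitution is probably the better fit.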

out = ' '.join([item for item in inp.split() if '@' not in item])
