How do I catch "split" exceptions in python?

I am trying to parse a list of email addresses, removing the username and the '@' symbol and leaving only the domain name.

Example: blahblah@gmail.com Desired output: gmail.com

I have accomplished this with the following code:

for row in cr: 
    emailaddy = row[0]
    (emailuser, domain) = row[0].split('@')
    print domain

but my issue is when I encounter an improperly formatted email address. For example, if the row contains "aaaaaaaaa" (instead of a valid email address), the program crashes with the error:

(emailuser, domain) = row[0].split('@')
ValueError: need more than 1 value to unpack. 

(as you would expect). Rather than check all the email addresses for validity, I would rather just skip grabbing the domain and move on to the next record. How can I properly handle this error and move on?

So for the list of:

blahblah@gmail.com
mmymymy@hotmail.com
youououou
nonononon@yahoo.com

I would like the output to be:

gmail.com
hotmail.com

yahoo.com

Thanks!

You want something like this?

try:
    (emailuser, domain) = row[0].split('@')
except ValueError:
    continue
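
Note that `continue` skips the bad record entirely, whereas the desired output in the question shows an empty line for it. A minimal sketch of the second behaviour, assuming a hypothetical helper `domains` and a plain list `rows` standing in for the reader `cr` (written so it runs on Python 3 as well as Python 2):

```python
def domains(rows):
    """Return one domain per row, or '' when the address does not
    split into exactly two parts around '@'."""
    result = []
    for row in rows:
        try:
            emailuser, domain = row[0].split('@')
        except ValueError:
            # Raised by the tuple unpacking, not by split() itself.
            domain = ''
        result.append(domain)
    return result

# Stand-in for the CSV rows from the question.
rows = [['blahblah@gmail.com'], ['mmymymy@hotmail.com'],
        ['youououou'], ['nonononon@yahoo.com']]

for d in domains(rows):
    print(d)
```

An address with two '@' symbols also ends up as '' here, because unpacking three values into two names raises ValueError as well.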

You can just filter out the addresses that do not contain '@':

>>> [mail.split('@')[1] for mail in mylist if '@' in mail]
['gmail.com', 'hotmail.com', 'yahoo.com']
>>>

What about

splitaddr = row[0].split('@')
if len(splitaddr) == 2:
    domain = splitaddr[1]
else:
    domain = ''

This even handles cases like aaa@bbb@ccc, treating them as invalid ( '' ).

Try this

In [28]: b = ['blahblah@gmail.com',
 'mmymymy@hotmail.com',
 'youououou',
 'nonononon@yahoo.com']

In [29]: [x.split('@')[1] for x in b if '@' in x]
Out[29]: ['gmail.com', 'hotmail.com', 'yahoo.com']

This does what you want:

l = ["blahblah@gmail.com", "mmymymy@hotmail.com",
     "youououou", "nonononon@yahoo.com", "amy@bong@youso.com"]

for e in l:
    if '@' in e:
        # Keep only the part after the last '@'.
        l2 = e.split('@')
        print l2[-1]
    else:
        # No '@' at all: print an empty line.
        print

Output:

gmail.com
hotmail.com

yahoo.com
youso.com

It handles the case where an address contains more than one '@' by taking only the part to the right of the last one.
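
The same "right of the last '@'" rule can probably be written without the explicit membership test by using str.rpartition(). This is only a sketch; `last_domain` is a hypothetical helper name, not part of the answer above:

```python
def last_domain(address):
    """Return the text after the last '@', or '' if there is no '@'."""
    # rpartition always returns a 3-tuple (head, separator, tail).
    # When '@' is absent the separator is '' and the whole string
    # lands in tail, so the separator has to be checked explicitly.
    head, sep, tail = address.rpartition('@')
    return tail if sep else ''

for e in ["blahblah@gmail.com", "youououou", "amy@bong@youso.com"]:
    print(last_domain(e))
```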

if '@' in row[0]:
    user, domain = row[0].split('@')
    print domain

We can treat a string without an '@' symbol as a bare username:

try:
    (emailuser, domain) = row[0].split('@')
    print "Email User : " + emailuser
    print "Email Domain : " + domain
except ValueError:
    # No '@' (or more than one): keep the whole string as the username.
    emailuser = row[0]
    print "Email User Only : " + emailuser

Output:
Email User : abc
Email Domain : gmail.com

Email User : xyz
Email Domain : gmail.com

Email User Only : usernameonly

Maybe the best solution is to avoid exception handling altogether. You can do this with the built-in string method partition(). It is similar to split(), but it always returns a 3-tuple, so unpacking its result never raises ValueError, even when the separator is not found.
Read more:
https://docs.python.org/3/library/stdtypes.html#str.partition
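
As a rough illustration of the partition() approach, assuming a hypothetical helper named `domain_of` (the code runs on Python 3 and Python 2 alike):

```python
def domain_of(address):
    """Return everything after the first '@', or '' if there is none."""
    # partition returns (before, separator, after). When '@' is missing,
    # both separator and after are empty strings, so no exception occurs.
    user, sep, domain = address.partition('@')
    return domain

addresses = ['blahblah@gmail.com', 'mmymymy@hotmail.com',
             'youououou', 'nonononon@yahoo.com']

for address in addresses:
    print(domain_of(address))
```

One caveat: partition() splits at the first '@', so an input such as aaa@bbb@ccc yields 'bbb@ccc' rather than being rejected. Whether that is acceptable depends on how strictly malformed addresses need to be handled.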


 