
Parse a comma-separated list of emails in the format “Name” <email> in Python

Input (comma separated list):

"\"Mr ABC\" <mr@abc.com>, \"Foo, Bar\" <foo@bar.com>, mr@xyz.com"

Expected output (list of 2-tuples):

[("Mr ABC", "mr@abc.com"), ("Foo, Bar", "foo@bar.com"), ("", "mr@xyz.com")]

I was going to split on commas and then call email.utils.parseaddr(address) on each piece, until I realized that the name part can also contain a comma, as in "Foo, Bar" above.

email.utils.getaddresses(fieldvalues) is very close to what I need, but it accepts a sequence, not a comma-separated string.

You may use the following regex:

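import re
# Group 1: quoted text; Group 2 (optional): address inside angle brackets
p = re.compile(r'"([^"]+)"(?:\s+<([^<>]+)>)?')
# Note: the bare address is quoted here, unlike in the question's input;
# the pattern only matches entries that start with a double quote.
test_str = '"Mr ABC" <mr@abc.com>, "Foo, Bar" <foo@bar.com>, "mr@xyz.com"'
print(re.findall(p, test_str))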

Output: [('Mr ABC', 'mr@abc.com'), ('Foo, Bar', 'foo@bar.com'), ('mr@xyz.com', '')]


The regex matches, in order:

  • " - a double quote
  • ([^"]+) - (Group 1) 1 or more characters other than a double quote
  • " - a double quote

Then an optional non-capturing group is introduced with the (?:...)? construct: (?:\s+<([^<>]+)>)?. It matches:

  • \s+ - 1 or more whitespace characters
  • < - an opening angle bracket
  • ([^<>]+) - (Group 2) 1 or more characters other than opening or closing angle brackets
  • > - a closing angle bracket
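
The breakdown above can be written straight into the pattern. The following is a minimal sketch that assumes the same test string as the answer and uses re.VERBOSE so each part carries its own comment; the names pattern and matches are only illustrative.

```python
import re

# Same pattern as in the answer, spread out with re.VERBOSE so that
# whitespace is ignored and each part can be annotated inline.
pattern = re.compile(r'''
    "([^"]+)"          # Group 1: the text between a pair of double quotes
    (?:                # start of the optional non-capturing group
        \s+            #   1 or more whitespace characters
        <([^<>]+)>     #   Group 2: the text between angle brackets
    )?                 # the whole group may be absent
''', re.VERBOSE)

test_str = '"Mr ABC" <mr@abc.com>, "Foo, Bar" <foo@bar.com>, "mr@xyz.com"'
matches = pattern.findall(test_str)
print(matches)
```

The verbose form should behave the same as the one-line pattern; it is only easier to read and maintain.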

The re.findall function collects the capture groups of every match into a list of tuples. As the re documentation puts it:

If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.
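
To illustrate that rule, here is a small sketch on a made-up string (the text and variable names are hypothetical, not from the question). It also shows that an optional group which did not take part in a match is reported as an empty string, which is why the quoted bare address above comes back with '' as its second element.

```python
import re

text = 'x=1, y=22, z'

# No groups: a list of whole matches.
whole = re.findall(r'\w=\d+', text)

# One group: a list of that group's strings.
single = re.findall(r'\w=(\d+)', text)

# Two groups: a list of 2-tuples; the optional second group
# yields '' when it did not participate in the match.
pairs = re.findall(r'(\w)(?:=(\d+))?', text)

print(whole)
print(single)
print(pairs)
```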


If you need the email to always be the second element of the tuple, swap the two elements whenever the second one is empty:

lst = re.findall(p, test_str)
print([(tpl[1], tpl[0]) if not tpl[1] else tpl for tpl in lst])
# => [('Mr ABC', 'mr@abc.com'), ('Foo, Bar', 'foo@bar.com'), ('', 'mr@xyz.com')]
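
As an alternative to the regex (a different technique, not the approach above), the standard library function mentioned in the question may already be enough: email.utils.getaddresses takes a sequence of header values, so the whole comma-separated string can be passed as a one-element list. This is a minimal sketch using the question's original input, with the bare address left unquoted; the names raw and pairs are illustrative.

```python
from email.utils import getaddresses

# The question's input: quoted display names (one containing a comma)
# followed by a bare address with no display name.
raw = '"Mr ABC" <mr@abc.com>, "Foo, Bar" <foo@bar.com>, mr@xyz.com'

# getaddresses expects a sequence of header values, so wrap the single
# comma-separated string in a one-element list.
pairs = getaddresses([raw])
print(pairs)
```

This should give the list of (name, email) 2-tuples the question asks for, with an empty name for the bare address, and it handles the comma inside "Foo, Bar" without any custom pattern.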

