
Regular expression for splitting email addresses (in Java)

I am hoping there is an easy way to do this, presumably with regular expressions. What is the best way in Java to split the following string into email addresses?

bob@home.com, "Jane" <jane@home.com>, "Smith, Mr" <smith@home.com>

The fact that a comma can appear inside the double quotes makes this somewhat more difficult. Ideally it would also work with single quotes:

bob@home.com, 'Jane, Ms' <jane@home.com>, "Smith, Mr" <smith@home.com>

I thought it would be worth checking whether there is an easier way, to save having to write a full parser!

Most addresses will be matched by the following pattern (it lists only A-Z, so use it with case-insensitive matching):

\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b

For RFC 2822 compliance, though, use:

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

Both are from regular-expressions.info, which also discusses where each falls short of "perfect".

In Java, call Matcher.find() repeatedly to pull out just the email addresses, without the names:

import java.util.regex.Pattern;
import java.util.regex.Matcher;

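public class Main {
    public static void main(String[] args) {
        new Main().findEmails("bob@home.com, \"Jane\" <jane@home.com>, \"Smith, Mr\" <smith@home.com>");
    }

    // Prints every substring of s that looks like an email address.
    public void findEmails(String s) {
        System.out.println("Input: " + s);
        // The pattern lists only A-Z, so CASE_INSENSITIVE is required for lowercase input.
        Pattern p = Pattern.compile("\\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}\\b",
                                    Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(s);
        // Each find() call resumes where the previous match ended.
        while (m.find()) {
            System.out.println("Found: " + m.group());
        }
    }
}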
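
If the display names are needed along with the addresses, the list has to be split on commas that sit outside quotes, which a plain String.split(",") cannot do. The following is only a minimal sketch of that idea, not a full RFC 2822 parser: the class name AddressSplitter and its helper are made up for illustration, backslash-escaped quotes are not handled, and an unquoted name containing an apostrophe (O'Brien) would be misread as the start of a single-quoted section.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class AddressSplitter {

    /** Splits input on commas that are not inside single or double quotes. */
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 means "outside quotes"; otherwise the quote character that opened the section

        for (char c : input.toCharArray()) {
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;          // matching closing quote
                }
                current.append(c);      // commas inside quotes are kept as text
            } else if (c == '"' || c == '\'') {
                quote = c;              // opening quote
                current.append(c);
            } else if (c == ',') {
                addIfNotBlank(parts, current);
                current.setLength(0);   // start the next entry
            } else {
                current.append(c);
            }
        }
        addIfNotBlank(parts, current);  // last entry has no trailing comma
        return parts;
    }

    private static void addIfNotBlank(List<String> parts, StringBuilder sb) {
        String entry = sb.toString().trim();
        if (!entry.isEmpty()) {
            parts.add(entry);
        }
    }

    public static void main(String[] args) {
        String s = "bob@home.com, 'Jane, Ms' <jane@home.com>, \"Smith, Mr\" <smith@home.com>";
        for (String entry : split(s)) {
            System.out.println(entry);
        }
    }
}
```

Each returned entry still contains the name and the angle brackets, so the email regex above can then be run on each entry separately.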

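From Chadwick's link, the RFC 2822 regex with backslashes doubled for a Java string literal (the two double quotes must additionally be written as \" inside the literal):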

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])
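
To show how this pattern might be embedded in Java source, here is a sketch that splits it into a local-part constant and a domain constant purely for readability. The class name Rfc2822Finder and the constant names are invented for this example; the pattern text itself is the regular-expressions.info one quoted earlier, and it is compiled with CASE_INSENSITIVE because it lists only lowercase letters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class around the RFC 2822 regex quoted above.
class Rfc2822Finder {

    // Local part: dot-separated atoms, or a double-quoted string.
    private static final String LOCAL =
        "(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
        + "|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]"
        + "|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")";

    // Domain: a dotted host name, or a bracketed literal such as [192.168.0.1].
    private static final String DOMAIN =
        "(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
        + "|\\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}"
        + "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:"
        + "(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]"
        + "|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])";

    private static final Pattern ADDRESS =
        Pattern.compile(LOCAL + "@" + DOMAIN, Pattern.CASE_INSENSITIVE);

    /** Returns every substring of s that matches the pattern, in order of appearance. */
    static List<String> findEmails(String s) {
        List<String> found = new ArrayList<>();
        Matcher m = ADDRESS.matcher(s);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String s = "bob@home.com, \"Jane\" <jane@home.com>, \"Smith, Mr\" <smith@home.com>";
        System.out.println(findEmails(s));
    }
}
```

As with the simpler pattern, this only extracts the addresses; the quoted display names are skipped because they are not followed by an @ sign.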

