
How to extract multiple email addresses from a web page using Jsoup?

I have a list of sites. For each one, I need to open the contact page and extract the email addresses using Jsoup. I'm using java.util.regex.Pattern to find the email address; the code is shown below:

Matcher m = Pattern.compile("[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+").matcher(doc.toString());
if (m.find()) {
    email = m.group();
    System.out.println(email);
}

Some websites contain multiple email addresses, but the code above returns only the first one it encounters. I would like to get all the email addresses from the page.

I also tried the code below, but it prints whole elements full of junk instead of just the addresses:

Elements elements = doc.getElementsMatchingText(Pattern.compile("[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+"));
for (Element element : elements) {
    System.out.println(element.toString());
}

How can I get all the email addresses from a web page? Please help.

Use a while loop instead of the if condition. Each call to m.find() advances to the next match, so the loop prints every address instead of only the first one.

while (m.find()) {
    email = m.group();
    System.out.println(email);
}

Or, without the intermediate variable:

while (m.find()) {
    System.out.println(m.group());
}
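
If you want the addresses as a collection rather than printed output, the same loop can fill a Set, which also drops duplicates (an address often appears twice, once in the link text and once in a mailto: href). This is only a minimal sketch: the class name EmailExtractor, the method extractEmails, and the sample HTML string are made up for illustration, and the string stands in for what doc.toString() would return from your Jsoup Document. The regex is the one from the question, unchanged.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not part of Jsoup or the question.
class EmailExtractor {
    // Same pattern as in the question, compiled once and reused.
    private static final Pattern EMAIL =
            Pattern.compile("[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+");

    // Returns every distinct address found in the text, in order of first appearance.
    public static Set<String> extractEmails(String text) {
        Set<String> emails = new LinkedHashSet<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {          // each find() call moves on to the next match
            emails.add(m.group());
        }
        return emails;
    }

    public static void main(String[] args) {
        // Assumed sample input; in the question this would be doc.toString().
        String html = "<p>Sales: sales@example.com</p>"
                + "<p>Support: <a href=\"mailto:support@example.com\">support@example.com</a></p>";
        for (String email : extractEmails(html)) {
            System.out.println(email);
        }
    }
}
```

The getElementsMatchingText attempt prints junk because it returns whole elements whose text matches, including every ancestor of the element that actually holds the address, so matching against the page string and keeping only m.group() is probably the simpler route.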
