
How to extract an email address using jsoup?

Elements elements = doc.select("span.st");
for (Element e : elements) {
    out.println("<p>Text : " + e.text() + "</p>");
}

Each Element e contains text with an email address somewhere in it. How can I extract the email address from that text? I have seen that the jsoup API docs offer :matches(regex), but I didn't understand how to use it. I'm trying to use

^[a-zA-Z0-9_!#$%&'*+/=?`{|}~^.-]+@[a-zA-Z0-9.-]+$

which I found while googling.

Thanks in advance for your help.

:matches(regex) is useful if you want to select elements whose text matches a given regex (e.g. find all nodes that contain an email address). It selects elements; it does not return the matched substring.

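I don't think that is what you want. Instead, you need to extract the email address from e.text() with a regex. Note that the pattern in your question is anchored with ^ and $, so it only matches when the whole string is an email address; to pick an address out of longer text, drop the anchors and use Matcher.find(). In your case: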

Elements elements = doc.select("span.st"); 
for (Element e : elements) {        
    out.println("<p>Text : " + e.text()+"</p>");
    out.println(extractEmail(e.text()));
}

// ...
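// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static String extractEmail(String str) {
    Matcher m = Pattern.compile("[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+").matcher(str);
    // find() scans for the pattern anywhere in str; return the first match only.
    if (m.find()) {
        return m.group();
    }
    // No email address found in the text.
    return null;
}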
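
If a span can contain more than one address, the same idea extends to collecting every match instead of returning the first one. The following is only a minimal sketch using the same pattern as above; the class name EmailExtractor, the method name extractEmails, and the sample text are made up for illustration and are not part of jsoup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of jsoup.
class EmailExtractor {
    // Same pattern as extractEmail above, compiled once and reused.
    private static final Pattern EMAIL =
            Pattern.compile("[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+");

    // Returns every substring of text that matches the pattern, in order.
    // The list is empty when nothing matches.
    static List<String> extractEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample text is an assumption; in the question it would be e.text().
        String sample = "Contact alice@example.com or bob.smith@mail.example.org for details";
        System.out.println(extractEmails(sample));
    }
}
```

Inside the loop from the question you would call EmailExtractor.extractEmails(e.text()) and print each entry. Keep in mind that the last character class allows dots, so an address directly followed by a sentence-ending period would be captured with that period attached.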

