
How to extract multiple email addresses from a web page using Jsoup?

I have a list of sites. For each one I need to go to the contact page and extract the email addresses using jsoup. I'm using java.util.regex.Pattern to find the addresses, as shown below:

Matcher m = Pattern.compile("[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+").matcher(doc.toString());
if (m.find()) {
    email = m.group();
    System.out.println(email);
}

Some of the websites contain multiple email addresses, but the code above only returns the first one it encounters. I would like to get all the email addresses from the page.

I also tried the code below, but it returns a lot of junk:

Elements elements = doc.getElementsMatchingText(Pattern.compile("[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+"));
for (Element element : elements) {
    System.out.println(element.toString());
}

How can I get all the email addresses from a web page? Please help.

Use a while loop instead of the if condition, so that every match is printed. Each call to m.find() advances to the next match and returns false once there are no more.

while (m.find()) {
    email = m.group();
    System.out.println(email);
}

Or, without the intermediate variable:

while (m.find()) {
    System.out.println(m.group());
}
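
For completeness, here is a minimal, self-contained sketch of the same while-loop idea that collects the matches instead of printing them. It uses only java.util.regex, so it takes the page markup as a plain string; with jsoup you would presumably pass in doc.toString() as in the question. The class name EmailExtractor, the method extractEmails, and the sample HTML are hypothetical, chosen for illustration. A LinkedHashSet is used on the assumption that an address repeated on the page (for example once in a mailto: link and once in the link text) should only be reported once.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of jsoup.
class EmailExtractor {
    // Same pattern as in the question; the hyphen in the last
    // character class is moved to the end so it is clearly a literal.
    private static final Pattern EMAIL = Pattern.compile(
            "[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+");

    // Returns every distinct address found in the text, in order of first appearance.
    static Set<String> extractEmails(String html) {
        Set<String> emails = new LinkedHashSet<>();
        Matcher m = EMAIL.matcher(html);
        // find() moves to the next match on each call, so the loop visits all of them.
        while (m.find()) {
            emails.add(m.group());
        }
        return emails;
    }

    public static void main(String[] args) {
        // Made-up sample markup standing in for doc.toString().
        String html = "<p>Sales: sales@example.com</p>"
                + "<a href=\"mailto:support@example.org\">support@example.org</a>";
        for (String email : extractEmails(html)) {
            System.out.println(email);
        }
    }
}
```

Note that this simple pattern can still pick up a trailing period when an address ends a sentence, so the results may need trimming.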
