
How to fetch the "href" link from email content using javax.mail?

Scenario: I need to click on the registration link in a Gmail message via javax.mail.

To do that, I need to extract the href content from the whole mail content, that is, the same value we see when inspecting the element in the browser.

In the picture below, I use Inspect Element to get the whole HTML tag for the "Getting Started Today" link.

The href shown by Inspect Element for the "Getting Started Today" link:

https://u9738139.ct.sendgrid.net/ls/click?upn=BPEdlQPL1bTBzMFJ4T-2FMSCLqyVGH4nH5Cfahbthey41XetxY34HDkr5T5zC4sod3uaKeK1sQ2hv3M8UWc0NU3Isz-2BeKa5UsULu9-2BP4LsdIee-2B67fC7jeXcHr1-2B6Nk7slXqar_cKXYbNIReP0b0mWRGpcgiH39UX-2BY091vJss-2F-2BFxybEmov93OKh5iqnOTsasYycySJEisyJxL-2FH3KxF0AqK76x5GNPM3X-2BczMI499TE-2FdRCi8AvcFbI9P3kemV1Cr-2BOQx3UHM0t5EVj4MOXcGk0jdl-2Bn80JT8bY3WlJj9EgQeCGNF1y1eNtROlvSfI3aEuUTCl1UicfnUgpNy2fSJrtGsxNBHSbVHrpTml-2FTO-2F6jUHBc-3D

Now, when I fetch the whole content of the "Getting Started Today" email, I get the following value:

 

3D"https://u9738139.ct.sendgrid.net/ls/click?upn=3DBPEdlQPL1bTBzMF=J4T-2FMSCLqyVGH4nH5Cfahbthey41XetxY34HDkr5T5zC4sod3uaKeK1sQ2hv3M8UWc0NU3Isz=-2BeKa5UsULu9-2BP4LsdIee-2B67fC7jeXcHr1-2B6Nk7slXqar_cKXYbNIReP0b0mWRGpcgiH=39UX-2BY091vJss-2F-2BFxybEmov93OKh5iqnOTsasYycySJEisyJxL-2FH3KxF0AqK76x5GNP=M3X-2BczMI499TE-2FdRCi8AvcFbI9P3kemV1Cr-2BOQx3UHM0t5EVj4MOXcGk0jdl-2Bn80JT8=bY3WlJj9EgQeCGNF1y1eNtROlvSfI3aEuUTCl1UicfnUgpNy2fSJrtGsxNBHSbVHrpTml-2FTO-=2F6jUHBc-3D"

I am able to retrieve the href link from the email, but when I compare the two values above (the one from Inspect Element and the one extracted via javax.mail), the extracted one contains extra characters such as "3D" and "=". As a result, when I try to click the link, it opens the wrong URL.

So I need a solution that lets me retrieve, via javax.mail, the same href value that Inspect Element shows.

Kindly suggest a solution to resolve the issue.

[screenshot]

Thanks,

Get the content, then use a regex to parse the URL out of the string content:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// stringToSearch is assumed to hold the mail body as a String.
Pattern p = Pattern.compile(
    "\\bhttps?://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]",
    Pattern.CASE_INSENSITIVE);
Matcher matcher = p.matcher(stringToSearch);
String foundUrl = "";
// Each match overwrites foundUrl, so the last URL in the body is the one printed.
while (matcher.find()) {
  foundUrl = matcher.group();
}
System.out.println(foundUrl);
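The stray "3D" and "=" in the question look like quoted-printable transfer encoding, where `=3D` stands for a literal `=` and a `=` at the end of a line is a soft line break. As far as I know, JavaMail's `Part.getContent()` normally undoes this for you, and `MimeUtility.decode(inputStream, "quoted-printable")` can be applied to a raw stream. Below is a minimal standard-library sketch of the same idea, so the regex runs on decoded text. The class name, the method names and the example.com URL are made up for illustration, and the decoded bytes are assumed to be UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedPrintableSketch {

    // Decodes quoted-printable text. Assumption: the decoded bytes are UTF-8.
    static String decodeQuotedPrintable(String raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c != '=') {
                out.write(c);                 // ordinary character, copy as is
                i++;
            } else if (raw.startsWith("=\r\n", i)) {
                i += 3;                       // soft line break (CRLF), drop it
            } else if (raw.startsWith("=\n", i)) {
                i += 2;                       // soft line break (LF only), drop it
            } else if (i + 2 < raw.length()
                    && Character.digit(raw.charAt(i + 1), 16) >= 0
                    && Character.digit(raw.charAt(i + 2), 16) >= 0) {
                // "=XX" is one byte written as two hex digits, e.g. "=3D" is '='
                out.write(Integer.parseInt(raw.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                out.write(c);                 // malformed escape, keep the '='
                i++;
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Returns the first double-quoted href value, or null if there is none.
    static String firstHref(String html) {
        Matcher m = Pattern.compile("href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE)
                .matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical raw body fragment with one escaped '=' pair and one soft line break.
        String raw = "<a href=3D\"https://example.com/ls/click?upn=3DABC=\r\n"
                + "DEF-2FGHI-3D\">Getting Started Today</a>";
        System.out.println(firstHref(decodeQuotedPrintable(raw)));
        // prints https://example.com/ls/click?upn=ABCDEF-2FGHI-3D
    }
}
```

The decoding has to run on the raw body while its line breaks are still in place. In the flattened value quoted in the question, a soft break followed by digits (for example `H=39UX`) would be misread as an encoded `9`.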

Another way is to parse the URL from the HTML with jsoup:

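import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

String stringToSearch = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
Document doc = Jsoup.parse(stringToSearch);
// attr("href") reads the attribute from the first matching <a> element.
String link = doc.select("a").attr("href");
System.out.println(link);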
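A mail usually contains several links, so it may help to pick the anchor by its visible text (here "Getting Started Today") rather than taking the first one. The following is a rough standard-library sketch of that idea. The class name, the method name and the example.com URLs are hypothetical, and the regex is a pragmatic pattern rather than a real HTML parser, so it will not cope with every markup variation.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkByTextSketch {

    // Matches <a ... href="..."> inner </a> with single or double quotes.
    // Group 2 is the href value, group 3 is the inner HTML of the anchor.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the href of the first anchor whose visible text equals linkText.
    static Optional<String> hrefForLinkText(String html, String linkText) {
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) {
            // Strip nested tags such as <b> so only the visible text is compared.
            String visible = m.group(3).replaceAll("<[^>]+>", "").trim();
            if (visible.equalsIgnoreCase(linkText)) {
                return Optional.of(m.group(2));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical, already decoded mail body with two links.
        String html = "<p><a href='https://example.com/help'>Help</a> "
                + "<a href=\"https://example.com/start?id=42\"><b>Getting Started Today</b></a></p>";
        System.out.println(hrefForLinkText(html, "Getting Started Today").orElse("(not found)"));
        // prints https://example.com/start?id=42
    }
}
```

This sketch does not unescape HTML entities such as `&amp;` inside the href, which a parser like jsoup handles on its own.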
