How to extract HTML's <td> tag data using regex in Java?
I am trying to read a username and password from an email using Java. The mail content is returned in HTML format, and I just want to extract the username and password, which appear inside <td> tags. Below is my HTML code snippet:
<table width="200">
<tbody>
<tr>
<td colspan="2">Your Account Details:</td>
</tr>
<tr>
<td>EmailId:</td>
<td><a class="moz-txt-link-abbreviated" href="mailto:jainish.m.kapadia@trimantra.net">jainish.m.kapadia@trimantra.net</a></td>
</tr>
<tr>
<td>Password:</td>
<td>C3mRXh+|n#1J</td>
</tr>
</tbody>
</table>
How do I achieve this?
Please don't try to parse HTML with regex. For a detailed explanation of why you shouldn't, see this SO answer.
You can use jsoup to parse your HTML strings like this:
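import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// The sample markup needs an element with id="content", otherwise getElementById returns null.
String html = "<html><head><title>First parse</title></head>"
+ "<body><div id=\"content\"><a href=\"http://example.com/\">Example</a></div></body></html>";
Document doc = Jsoup.parse(html);
Element content = doc.getElementById("content");
Elements links = content.getElementsByTag("a");
for (Element link : links) {
    String linkHref = link.attr("href"); // value of the href attribute
    String linkText = link.text();       // visible text of the link
}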
jsoup also offers methods for hierarchical navigation, such as siblingElements(), nextElementSibling(), and so on.
You can use the code snippet below:
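import java.util.regex.Matcher;
import java.util.regex.Pattern;
String str = "your html";
// Allow attributes such as colspan; group 1 captures the cell content.
Pattern pattern = Pattern.compile("<td\\b[^>]*>(.*?)</td>", Pattern.DOTALL);
Matcher matcher = pattern.matcher(str);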
This matches the <td> tags in the input. You can then loop over the matcher with matcher.find() and read each match with matcher.group() to get the string you need.
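As a rough sketch of that loop, assuming the mail body always uses the two-cell row layout shown in the question, something like the following could work. The class name TdRegexSketch and the methods cellTexts and labelledValues are made up for illustration, and the address in main is a placeholder; only java.util.regex is used. It still inherits the usual fragility of regex on HTML: attributes containing '>', nested tables, and entities such as &amp; are not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TdRegexSketch {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    // <tr> and <td> with optional attributes; DOTALL lets content span lines.
    private static final Pattern ROW = Pattern.compile("<tr\\b[^>]*>(.*?)</tr>", FLAGS);
    private static final Pattern CELL = Pattern.compile("<td\\b[^>]*>(.*?)</td>", FLAGS);
    // Crude tag stripper for nested markup such as <a href=...>.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    /** Returns the trimmed text of every <td> cell, in document order. */
    static List<String> cellTexts(String html) {
        List<String> cells = new ArrayList<>();
        Matcher matcher = CELL.matcher(html);
        while (matcher.find()) {
            cells.add(TAG.matcher(matcher.group(1)).replaceAll("").trim());
        }
        return cells;
    }

    /** Treats every row with exactly two cells as a label/value pair. */
    static Map<String, String> labelledValues(String html) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher rows = ROW.matcher(html);
        while (rows.find()) {
            List<String> cells = cellTexts(rows.group(1));
            if (cells.size() != 2) {
                continue; // e.g. the "Your Account Details:" heading row
            }
            String label = cells.get(0);
            if (label.endsWith(":")) {
                label = label.substring(0, label.length() - 1).trim();
            }
            values.put(label, cells.get(1));
        }
        return values;
    }

    public static void main(String[] args) {
        // Placeholder address; the structure mirrors the table in the question.
        String html = "<table><tr><td colspan=\"2\">Your Account Details:</td></tr>"
                + "<tr><td>EmailId:</td><td><a href=\"mailto:user@example.com\">user@example.com</a></td></tr>"
                + "<tr><td>Password:</td><td>C3mRXh+|n#1J</td></tr></table>";
        System.out.println(labelledValues(html));
        // prints {EmailId=user@example.com, Password=C3mRXh+|n#1J}
    }
}
```

Looking values up by label, for example labelledValues(html).get("Password"), avoids depending on the position of a cell in the table. If the mail template changes or the markup gets more complex, a real parser such as jsoup is likely the safer choice.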