Jsoup parse HTML including span tags
I have HTML in the following format:
<article class="cik" id="100">
<a class="ci" href="/abc/1001/STUFF">
<img alt="Micky Mouse" src="/images/1001.jpg" />
<span class="mick vtEnabled"></span>
</a>
<div>
<a href="/abc/1001/STUFF">Micky Mouse</a>
<span class="FP">$88.00</span> <span class="SP">$49.90</span>
</div>
</article>
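In the code above, the a tag inside the article contains a span with class="mick vtEnabled" and no text. I want to check whether a span tag with that class name exists inside the article tag. How do I do that? I tried select("> a[href] > span.mick vtEnabled") and checked the size. It stays 0 for every article tag, whether or not the span is present. Any input?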
It would be good to start from a single article tag:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Wrapper class added so the snippet compiles; the name is arbitrary.
public class ArticleParser {
    public static void main(String[] args) {
        final String test = "<article class=\"cik\" id=\"100\"><a class=\"ci\" href=\"/abc/1001/STUFF\"><img alt=\"Micky Mouse\" src=\"/images/1001.jpg\" /></a><div><a href=\"/abc/1001/STUFF\">Micky Mouse</a><span class=\"FP\">$88.00</span> <span class=\"SP\">$49.90</span></div></article>";
        final Elements articles = Jsoup.parse(test).select("article");
        for (final Element article : articles) {
            // Images inside the link that is a direct child of the article
            final Elements articleImages = article.select("> a[href] > img[src]");
            for (final Element image : articleImages) {
                System.out.println(image.attr("src"));
            }
            // Links inside the div that is a direct child of the article
            final Elements articleLinks = article.select("> div > a[href]");
            for (final Element link : articleLinks) {
                System.out.println(link.attr("href"));
                System.out.println(link.text());
            }
            // First price (class FP)
            final Elements articleFPSpans = article.select("> div > span.FP");
            for (final Element span : articleFPSpans) {
                System.out.println(span.text());
            }
            // Second price (class SP); this loop must stay inside the article loop
            final Elements articleSPSpans = article.select("> div > span.SP");
            for (final Element span : articleSPSpans) {
                System.out.println(span.text());
            }
        }
    }
}
Prints:
/images/1001.jpg
/abc/1001/STUFF
Micky Mouse
$88.00
$49.90
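
As for the check asked about in the question, a likely reason the selector matches nothing is the space in "span.mick vtEnabled". In CSS selector syntax a space is the descendant combinator, so that selector looks for a vtEnabled element nested inside span.mick rather than a span carrying both classes. Chaining the classes as span.mick.vtEnabled should match. The following is a minimal sketch, assuming jsoup is on the classpath; the class SpanCheck, the helper hasMickSpan, and the second article (id 101) are made up for illustration:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

class SpanCheck {
    // Hypothetical helper: true if a link that is a direct child of the article
    // contains a span carrying both the "mick" and "vtEnabled" classes.
    static boolean hasMickSpan(Element article) {
        return !article.select("> a[href] > span.mick.vtEnabled").isEmpty();
    }

    public static void main(String[] args) {
        // Article 100 follows the question's markup; article 101 is an
        // invented counterexample without the span.
        String html =
            "<article class=\"cik\" id=\"100\">"
          + "<a class=\"ci\" href=\"/abc/1001/STUFF\">"
          + "<img alt=\"Micky Mouse\" src=\"/images/1001.jpg\" />"
          + "<span class=\"mick vtEnabled\"></span></a></article>"
          + "<article class=\"cik\" id=\"101\">"
          + "<a class=\"ci\" href=\"/abc/1002/STUFF\">"
          + "<img alt=\"Other\" src=\"/images/1002.jpg\" /></a></article>";

        for (Element article : Jsoup.parse(html).select("article")) {
            System.out.println(article.id() + ": " + hasMickSpan(article));
        }
    }
}
```

With these two articles, the loop should report true for 100 and false for 101. If only one of the two classes matters, span.mick or span.vtEnabled alone would be enough.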