
HTML parsing with Jsoup

<div>
<div class = "main">
    <div class ="content">
        <div class="content_left">
            <div class="alisveris_context_box">
                <ul class = "sinema_list">
                    <li>
                        <a href="blabla/12" title="asd">
                            <img src="http://asd.jpg">
                            <span class ="cartoon">
                                Textaa
                            </span>

How can I get the href value (blabla/12 in the example) and the span value (Textaa in the example)?

Let's say your HTML is the following.

String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
Document doc = Jsoup.parse(html);
Element link = doc.select("a").first();

String linkHref = link.attr("href"); // "http://example.com/"

link.attr("href") will contain your link.

The same goes for your span. Think it through yourself :)

Source: http://jsoup.org/cookbook/extracting-data/attributes-text-html
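For anyone who wants the span part spelled out, here is a minimal sketch of the same approach applied to the markup from the question. The class name HrefAndSpan and the helper methods are made up for illustration, and the HTML string is a shortened copy of the question's fragment (the unclosed tags are left as posted, since jsoup's parser closes them).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class; the names are illustrative, not from the original answer.
public class HrefAndSpan {
    // Shortened copy of the question's fragment, with tags left unclosed as in the post.
    static final String HTML =
        "<ul class=\"sinema_list\"><li>"
        + "<a href=\"blabla/12\" title=\"asd\">"
        + "<img src=\"http://asd.jpg\">"
        + "<span class=\"cartoon\"> Textaa </span>";

    // Returns the href of the first <a>, or null if there is no link.
    static String href(String html) {
        Document doc = Jsoup.parse(html);
        Element link = doc.select("a").first();
        return link == null ? null : link.attr("href");
    }

    // Returns the trimmed text of the first span.cartoon inside a link, or null.
    static String spanText(String html) {
        Document doc = Jsoup.parse(html);
        Element span = doc.select("a span.cartoon").first();
        return span == null ? null : span.text();
    }

    public static void main(String[] args) {
        System.out.println(href(HTML));     // blabla/12
        System.out.println(spanText(HTML)); // Textaa
    }
}
```

Element.text() normalizes and trims whitespace, so the padding around Textaa in the source does not show up in the result.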

Using Jsoup, you can get both values easily. You will get the span value with this:

String st="<div> <div class = \"main\">     <div class =\"content\">        "
            + "<div class=\"content_left\">  <div class=\"alisveris_context_box\">"
            + "   <ul class = \"sinema_list\">  <li>  <a href=\"blabla/12\" title=\"asd\">"
            + "<img src=\"http://asd.jpg\">  <span class =\"cartoon\">     Textaa           </span>";
String spanValue=Jsoup.parse(st).text();

and the href value with:

String href=Jsoup.parse(st).getElementsByTag("a").attr("href");

Elements elements = Jsoup.parse(html).select("div[class=main] div[class=content] div[class=content_left] div[class=alisveris_context_box] ul[class=sinema_list] li a");

String href = elements.first().attr("href");
String spanText = elements.first().select("span[class=cartoon]").first().text();
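If the list holds more than one item, the same selector idea can be looped over. Below is a rough sketch, not a definitive implementation: it uses the shorter class-selector form (ul.sinema_list instead of ul[class=sinema_list]) and guards against links that have no span. The class name SinemaLinks, the second list item (blabla/13), and the sample string are assumptions added for illustration. It also assumes a jsoup version that provides Element.selectFirst; on older versions, select(...).first() does the same job.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

// Hypothetical helper class; not part of the original answer.
public class SinemaLinks {
    // Sample markup: the first item mirrors the question, the second is invented.
    static final String SAMPLE =
        "<ul class=\"sinema_list\">"
        + "<li><a href=\"blabla/12\"><span class=\"cartoon\"> Textaa </span></a></li>"
        + "<li><a href=\"blabla/13\">no span here</a></li>"
        + "</ul>";

    // Maps each link's href to the text of its span.cartoon, in document order.
    // Links without such a span map to an empty string.
    static Map<String, String> extract(String html) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Element a : Jsoup.parse(html).select("ul.sinema_list li a[href]")) {
            Element span = a.selectFirst("span.cartoon");
            result.put(a.attr("href"), span == null ? "" : span.text());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extract(SAMPLE));
    }
}
```

The null check matters because first() and selectFirst() return null when nothing matches, so chaining .text() directly would throw a NullPointerException on an item without a span.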

