
Scrape HTML in Android

I need to scrape a URL in my Android app. The URL returns the block of HTML below:

  <div id="main">
   <div id="header">
    <form action="/search_db.php" id="f1" method="GET">
    <div style="float:left; width:829px;">
    <span style="margin:15px;"><a href="http://mp3skull.com/"><img    src="http://mp3skull.com/img/logo.jpg" border="0" alt="mp3skull.com - mp3 downloads"    style="vertical-align:middle;" /></a></span>
    <input type="text" name="q" id="sfrm" autocomplete="off" value="feel good inc gorillaz"   style="font-size:18px; vertical-align:middle; width:470px;"> 
    <input type="hidden" name="fckh" value="c1935e9a779034dec31fe7117c456eb8">
    <input type="submit" id="search_button" value="Search" style="font-size:18px; vertical-align:middle;">
    </div>
    <div style="float:left; text-align:right;">
    </div>
    <div style="clear:both;"></div>
    </form><script type="text/javascript">document.getElementById('sfrm').focus();InstallAC(document.getElementById('f1'), document.getElementById('sfrm'), document.getElementById('search_button'), '', 'en');</script>
</div>

Please show me an example of how to extract the values from the returned HTML in Java.

Using jsoup:

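import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

Document doc = Jsoup.connect("http://your/url/here").get(); // or Jsoup.parse(htmlString);
Elements header = doc.select("#header");   // selects <div id="header">...</div>
Elements inputs = header.select("input");  // every <input> inside the header
for (Element input : inputs) {
    System.out.println(input);             // prints the whole <input ...> tag
    System.out.println(input.attr("id"));  // prints the id attribute (empty string if absent)
}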
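
Since the question asks for the values rather than the ids, here is a minimal sketch of one way to collect them with jsoup. The class name `InputValueExtractor`, the method `extractInputValues`, and the trimmed `SAMPLE_HTML` string are hypothetical names introduced for illustration. Keying the submit button by its `id` is an assumption made only because that input has no `name` attribute.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class InputValueExtractor {

    // Trimmed copy of the form from the question, used as offline sample input.
    static final String SAMPLE_HTML =
        "<div id=\"header\">"
      + "<form action=\"/search_db.php\" id=\"f1\" method=\"GET\">"
      + "<input type=\"text\" name=\"q\" id=\"sfrm\" value=\"feel good inc gorillaz\">"
      + "<input type=\"hidden\" name=\"fckh\" value=\"c1935e9a779034dec31fe7117c456eb8\">"
      + "<input type=\"submit\" id=\"search_button\" value=\"Search\">"
      + "</form></div>";

    // Maps each <input> under #header to its value attribute, in document order.
    static Map<String, String> extractInputValues(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, String> values = new LinkedHashMap<>();
        for (Element input : doc.select("#header input")) {
            // Assumption: fall back to the id when the input has no name attribute.
            String key = input.hasAttr("name") ? input.attr("name") : input.attr("id");
            values.put(key, input.val()); // val() returns the value attribute of an <input>
        }
        return values;
    }

    public static void main(String[] args) {
        extractInputValues(SAMPLE_HTML)
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

To run this against the live page, replace `Jsoup.parse(html)` with `Jsoup.connect(url).get()`. On Android that network call has to run off the main thread, for example on a background executor.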
