What is the best way to scrape this HTML for an Android app?
What is the best way to scrape the HTML below from a web page? I want to pull out Apple, Orange and Grape and put them into a dropdown menu in my Android app. Should I use Jsoup for this, and if so, what would be the best way to do it? Should I use regex instead?
<select name="fruit" id="fruit" >
<option value="APPLE">Apple</option>
<option value="ORANGE">Orange</option>
<option value="GRAPE">Grape</option>
</select>
It depends, but I'd go with an XML/HTML parser. Don't use regex.
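// Fetch and parse the page (on Android, run this off the main thread).
Document doc = Jsoup.connect(someUrl).get();
Elements options = doc.select("select#fruit option");
List<String> fruits = options.eachText(); // visible text of each <option>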
More on jsoup selector syntax.
I would go with either the built-in DOM parser or the SAX parser. If you're going to be parsing a large document, SAX is faster, since it streams through the input instead of building the whole tree in memory. If the document is small, there's not much difference. More on SAX vs DOM.
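As a rough illustration of the built-in DOM route, here is a minimal sketch that parses the snippet from the question with the standard `javax.xml.parsers` API. It assumes the markup is well-formed XML, which holds for this fragment but often not for real-world pages. The class name `FruitOptions` and the helper `parseOptions` are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FruitOptions {
    // Hypothetical helper: returns the visible text of every <option> element.
    // Assumes the markup is well-formed XML; malformed HTML will throw.
    static List<String> parseOptions(String markup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(markup)));

        NodeList options = doc.getElementsByTagName("option");
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < options.getLength(); i++) {
            labels.add(options.item(i).getTextContent().trim());
        }
        return labels;
    }

    public static void main(String[] args) throws Exception {
        // The fragment from the question, inlined as a string.
        String markup = "<select name=\"fruit\" id=\"fruit\">"
                + "<option value=\"APPLE\">Apple</option>"
                + "<option value=\"ORANGE\">Orange</option>"
                + "<option value=\"GRAPE\">Grape</option>"
                + "</select>";
        System.out.println(parseOptions(markup)); // prints [Apple, Orange, Grape]
    }
}
```

The resulting list can then be handed to whatever adapter backs the dropdown. If the page is not valid XML, a lenient HTML parser is likely the safer choice.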
For HTML parsing you can use jsoup. It is very easy to use and the API is great.
http://jsoup.org/
For me it worked great!
EDIT: too slow :D skyuzo's post is great :)
WebView is your friend:
http://developer.android.com/reference/android/webkit/WebView.html
It lets you grab HTML the way a browser does, and then you can do stuff with it. Note that WebView does not run JavaScript by default (it has to be enabled with getSettings().setJavaScriptEnabled(true)), so I hope what you have there is plain HTML and not something fetched via AJAX or generated by JS :)