I am currently using Jsoup to parse an HTML page. The code is quite simple:
Document doc = null;
try {
    doc = Jsoup.connect(link).get();
} catch (Exception e) {
    //System.out.println("Some error occurred.");
    textView.setText(e.getMessage());
}
It does give me the webpage I want, and later I can extract the data I need from it with its getElementsByTag method and so on. However, I only want to use part of the webpage. For example, I wish to discard everything after <!-- / foo --> in the page. Is there any way to drop the rest of the page after that string and get a new Document containing only the part I want? I checked the cookbook, but it seems to only process the webpage through its structure, so I am not sure whether it is OK to do something like string removal. Thanks for reading.
You can use Document doc = Jsoup.parse(html), where html is the page's HTML as a string. That is, fetch the raw HTML first:
Connection connect = Jsoup.connect(url);
Connection.Response response = connect.execute();
String html = response.body();
Then do whatever operations you need on the string (e.g. cut the HTML after the marker, but add any necessary closing HTML tags), and then parse the result:
Document doc = Jsoup.parse(html);
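The "cut HTML after marker" step can be done with plain string operations before the string is handed to Jsoup.parse. The following is only a minimal sketch: the class name HtmlTruncator, the method truncateAfterMarker, and the sample marker <!-- / foo --> are hypothetical names chosen for illustration, and it assumes the marker appears literally in the fetched HTML. It keeps everything up to and including the first occurrence of the marker.

```java
// Hypothetical helper, not part of jsoup: trims an HTML string at a marker.
final class HtmlTruncator {
    private HtmlTruncator() {}

    /**
     * Returns html up to and including the first occurrence of marker.
     * If the marker is not found, html is returned unchanged.
     */
    static String truncateAfterMarker(String html, String marker) {
        int index = html.indexOf(marker);
        if (index < 0) {
            return html; // marker absent: keep the whole page
        }
        return html.substring(0, index + marker.length());
    }

    public static void main(String[] args) {
        // Sample input standing in for response.body(); the marker is an assumption.
        String html = "<html><body><p>keep</p><!-- / foo --><p>drop</p></body></html>";
        String truncated = truncateAfterMarker(html, "<!-- / foo -->");
        System.out.println(truncated);
        // With jsoup on the classpath, the trimmed string would then be parsed:
        // Document doc = Jsoup.parse(truncated);
    }
}
```

In the answer's flow, the result of truncateAfterMarker(response.body(), marker) would be the string passed to Jsoup.parse. The jsoup parser is generally tolerant of unclosed tags, so the truncated string should still produce a usable Document, though appending the closing tags explicitly keeps the result predictable.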