I am unable to get the main image and name of products on Amazon or Flipkart using Jsoup.
My Java/Jsoup code is:
// For amazon
Connection connection = Jsoup.connect(url).timeout(5000).maxBodySize(1024*1024*10);
Document doc = connection.get();
Elements imgs = doc.select("img#landingImage");
Elements names = doc.select("span#productTitle");
// For flipkart
Connection connection = Jsoup.connect(url).timeout(5000).maxBodySize(1024*1024*10);
Document doc = connection.get();
Elements imgs = doc.select("img.productImage.current");
Elements names = doc.select("h1.title");
Can someone please point out what I am missing here?
URLs I have used are:
and
Also, I would like to do this parsing on the front end, if possible, using JavaScript and jQuery.
Is there a way to do that?
I found the issue.
On Google App Engine (GAE), Jsoup works if the page is fetched through the URL Fetch service via java.net.URL, like this:
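import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

// Fetches the page through java.net.URL and returns its non-blank lines.
private String read(String url) throws IOException
{
    URL urlObj = new URL(url);
    StringBuilder sbuf = new StringBuilder();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(urlObj.openStream()))) {
        String line;
        while ((line = reader.readLine()) != null) {
            // skip blank lines to keep the HTML string compact
            if (line.trim().length() > 0)
                sbuf.append(line).append("\n");
        }
    }
    return sbuf.toString();
}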
Then parse the result with Jsoup as usual:
String html = read(url);
Document doc = Jsoup.parse(html);
Doing the above works very well.
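The original Jsoup call set a 5000 ms timeout, which the plain openStream() version of read drops. Here is a minimal sketch of how the same helper might carry that timeout over using java.net.URLConnection. The class name PageFetcher, the timeoutMillis parameter and the UTF-8 charset are assumptions made for illustration and are not from the original code. I have not verified how GAE's URL Fetch service treats these settings.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; only the read logic mirrors the answer above.
class PageFetcher {

    // Reads the resource at the given URL and returns its non-blank lines,
    // each terminated by '\n'. timeoutMillis applies to both connect and read.
    static String read(String url, int timeoutMillis) throws IOException {
        URLConnection conn = new URL(url).openConnection();
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);

        StringBuilder sb = new StringBuilder();
        // Assumption: the page is UTF-8 encoded; adjust if the site differs.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    sb.append(line).append('\n');
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java PageFetcher <url>");
            return;
        }
        System.out.println(read(args[0], 5000));
    }
}
```

The returned string can be passed to Jsoup.parse(html) exactly as before.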