
jsoup image is not getting parsed

I am using jsoup to retrieve images from the following web page: http://www.jcpenney.com/dotcom/jewelry-watches/fine-jewelry/mens-jewelry/bulova%25c2%25ae-mens-stainless-steel-watch/prod.jump?ppId=180d97e&catId=cat100240089&selectedLotId=0514592&selectedSKUId=05145920000&navState=navState-:catId-cat100240089:subcatId-:subcatZone-false:N-100240089%20158:Ns-:Nao-0:ps-24:pn-1:Ntt-:Nf-:action-guided%20navigation&catId=SearchResults

My code is:

String url = "http://www.jcpenney.com/dotcom/jewelry-watches/fine-jewelry/mens-jewelry/bulova%25c2%25ae-mens-stainless-steel-watch/prod.jump?ppId=180d97e&catId=cat100240089&selectedLotId=0514592&selectedSKUId=05145920000&navState=navState-:catId-cat100240089:subcatId-:subcatZone-false:N-100240089%20158:Ns-:Nao-0:ps-24:pn-1:Ntt-:Nf-:action-guided%20navigation&catId=SearchResults";

Document doc = Jsoup.connect(url).userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.120 Safari/535.2").get();

String imgUrl = doc.select("#mapImageSjElement4 img").attr("abs:src");

It should return the image URL, but I am not getting one. Any suggestions? I want to retrieve the main product image, which is on the left side of the web page.

If you print the whole document, you'll see that this image, and much else on the site, is loaded by JavaScript scattered all over the page. jsoup does not execute JavaScript, so the element is not in the HTML it parses. To get that image, you'll have to choose between two options:

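  1. Use a headless (GUI-less) browser, for example Selenium WebDriver or HtmlUnit, and read the page's HTML once it has fully loaded. A plain HTTP client such as HttpClient only downloads the page and does not run its scripts.
  2. Emulate the JavaScript by studying its code, and retrieve the data you want yourself.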

Here is one way to use the second approach without adding any extra libraries to your project:

// Let's say you have the right script in a String
// variable named javascript.
String[] lines = javascript.split("\n");

// Keep the first line that mentions the variable holding the image URL.
String imgUrl = "";
for (String line : lines) {
    if (line.contains("imgUrl variable name here")) {
        imgUrl = line;
        break;
    }
}

// imgUrl now holds the whole line of JavaScript.
// Split / substring it until you have narrowed
// it down to the value you want.
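
The narrowing-down step could look roughly like the sketch below, which uses only the standard library. It assumes the script assigns the URL as a quoted string on the same line as the variable name. The class name ScriptValueExtractor, the method extractQuotedValue, the variable name imgUrl and the example.com URL are all made up for illustration and are not taken from the actual page.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScriptValueExtractor {

    // Matches a single- or double-quoted string literal (no escaped-quote handling).
    private static final Pattern QUOTED = Pattern.compile("([\"'])(.*?)\\1");

    /**
     * Scans the script line by line and returns the first quoted value that
     * appears after variableName on a line mentioning it.
     */
    static Optional<String> extractQuotedValue(String javascript, String variableName) {
        for (String line : javascript.split("\\R")) {
            int idx = line.indexOf(variableName);
            if (idx < 0) {
                continue; // this line does not mention the variable
            }
            Matcher m = QUOTED.matcher(line);
            // Start searching right after the variable name.
            if (m.find(idx + variableName.length())) {
                return Optional.of(m.group(2));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical script content; the real page's script will differ.
        String javascript = "var width = 400;\n"
                + "var imgUrl = \"http://example.com/images/watch.jpg\";\n";
        System.out.println(extractQuotedValue(javascript, "imgUrl").orElse("not found"));
    }
}
```

If the real script builds the URL from several pieces, or escapes quotes inside the string, a simple pattern like this will not be enough and you will need to adapt it to what the script actually does.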


 