jsoup image is not getting parsed

I am using jsoup to retrieve an image from the following web page: http://www.jcpenney.com/dotcom/jewelry-watches/fine-jewelry/mens-jewelry/bulova%25c2%25ae-mens-stainless-steel-watch/prod.jump?ppId=180d97e&catId=cat100240089&selectedLotId=0514592&selectedSKUId=05145920000&navState=navState-:catId-cat100240089:subcatId-:subcatZone-false:N-100240089%20158:Ns-:Nao-0:ps-24:pn-1:Ntt-:Nf-:action-guided%20navigation&catId=SearchResults

My code is:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

String url = "http://www.jcpenney.com/dotcom/jewelry-watches/fine-jewelry/mens-jewelry/bulova%25c2%25ae-mens-stainless-steel-watch/prod.jump?ppId=180d97e&catId=cat100240089&selectedLotId=0514592&selectedSKUId=05145920000&navState=navState-:catId-cat100240089:subcatId-:subcatZone-false:N-100240089%20158:Ns-:Nao-0:ps-24:pn-1:Ntt-:Nf-:action-guided%20navigation&catId=SearchResults";

// Fetch the page with a browser-like user agent.
Document doc = Jsoup.connect(url)
        .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.120 Safari/535.2")
        .get();

// Try to read the absolute URL of the main product image.
String imgUrl = doc.select("#mapImageSjElement4 img").attr("abs:src");

It should return the image URL, but I am not getting one. Any suggestions? I want to retrieve the main product image on the left side of the web page.

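If you print the whole document, you will see that the image, along with much of the site's other content, is loaded by JavaScript scripts scattered throughout the page. jsoup does not execute JavaScript, so the element you are selecting never appears in the parsed document. To get the image, you have to choose between two options: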

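  1. Use a GUI-less (headless) web browser, for example through Selenium WebDriver or HtmlUnit, and read the HTML content once the page has fully loaded. A plain HTTP client such as HttpClient does not run the page's scripts, so on its own it is not enough here.
  2. Emulate the JavaScript by studying its code, and retrieve the data you want from it.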

Here is one way to use the second option without adding any extra library to your project:

// Assume the relevant script is already in a String
// variable named javascript.
String[] lines = javascript.split("\n");

String imgUrl = "";
for (String line : lines) {
    // Replace the placeholder with the variable name the page's script uses.
    if (line.contains("imgUrl variable name here")) {
        imgUrl = line;
        break;
    }
}

// imgUrl now holds the whole matching line. Split or substring it
// until only the URL itself is left.
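
The final "split / substring" step can be sketched with a regular expression. This is only an illustration, not the page's real script: the class name `ScriptImageExtractor`, the method `extractQuotedValue`, the variable name `mainImageUrl`, and the example.com URL are all made up, and it assumes the script assigns the URL as a quoted string literal on the same line as the variable name. With jsoup, the script text itself could be read from the elements returned by `doc.select("script")` using `Element.data()`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScriptImageExtractor {

    // Scans the script line by line. On a line that mentions variableName,
    // returns the first quoted string literal that follows the name.
    public static Optional<String> extractQuotedValue(String javascript, String variableName) {
        Pattern quoted = Pattern.compile("[\"']([^\"']+)[\"']");
        for (String line : javascript.split("\\R")) {
            int idx = line.indexOf(variableName);
            if (idx < 0) {
                continue; // this line does not mention the variable
            }
            Matcher m = quoted.matcher(line);
            // Start searching right after the variable name.
            if (m.find(idx + variableName.length())) {
                return Optional.of(m.group(1));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical script text, standing in for what the page contains.
        String javascript = String.join("\n",
                "var productId = '180d97e';",
                "var mainImageUrl = \"http://example.com/images/watch.jpg\";",
                "render(mainImageUrl);");

        String imgUrl = extractQuotedValue(javascript, "mainImageUrl").orElse("not found");
        System.out.println(imgUrl);
    }
}
```

Returning an `Optional` keeps the "variable not found" case explicit instead of leaving `imgUrl` as an empty string. If the real script builds the URL by concatenation or stores it in a JSON object, the pattern would need to be adapted to that shape.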

