
Android HTML Jsoup

I'm trying to get the absolute URLs for images from my college's news website but have so far been unsuccessful. I am working from this site: http://www.dcu.ie/news/index.shtml. As you can see from the page source, the first image has an absolute URL but the remainder have only relative URLs. I have tried examples from the Jsoup documentation but can't get them to work. The code below displays the first image and then empty boxes for the rest. I'd appreciate any help. Thanks.

public class NewsActivity extends Activity {
    WebView mWebView;
    String test2 = "<html><body>";
    Document docs;

    public void main(String... args) {
        try {
            docs = Jsoup.connect("http://www.dcu.ie/news/index.shtml").get();
        } catch (IOException e) {
            e.printStackTrace();
            return; // docs is still null if the request failed
        }
        Elements imgs = docs.select("img[src$=.jpg]");
        for (Element img : imgs) {
            // toString() yields the element's full HTML, with src exactly as written in the page
            String url = img.toString();
            test2 = test2 + " " + url + " ";
        }
    }

    @Override
    public void onCreate(Bundle savedInstanceState) {
        main();

        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        mWebView = (WebView) findViewById(R.id.webview);
        mWebView.setWebViewClient(new NewsClient());
        mWebView.getSettings().setJavaScriptEnabled(true);
        mWebView.getSettings().setDomStorageEnabled(true);
        mWebView.loadData(test2, "text/html", "utf-8");
    }
}

You need Element#absUrl() to extract the absolute URL. Element#toString() returns the text representation of the whole HTML element, so any relative src is passed through unchanged and the WebView has no base URL to resolve it against.

Elements imgs = docs.select("img[src$=.jpg]");
for (Element img : imgs) {
    String url = img.absUrl("src");
    String newImg = "<img src=\"" + url + "\"/>";
    // ...
}
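
To make the resolution step concrete, here is a minimal sketch that uses only the JDK's java.net.URI to turn relative src values into absolute ones and build the HTML string. It approximates what absUrl("src") does for common cases and is not Jsoup's own implementation. The class name AbsoluteUrlSketch, the helper names, and the image paths are hypothetical and chosen only for illustration.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

class AbsoluteUrlSketch {
    // Resolve a possibly relative src against the page's base URI.
    // An src that is already absolute is returned unchanged.
    // Note: URI.create throws IllegalArgumentException on malformed input (e.g. spaces).
    static String resolve(String baseUri, String src) {
        return URI.create(baseUri).resolve(src).toString();
    }

    // Build a small HTML document with one <img> tag per resolved URL.
    static String buildHtml(String baseUri, List<String> srcs) {
        StringBuilder html = new StringBuilder("<html><body>");
        for (String src : srcs) {
            html.append("<img src=\"").append(resolve(baseUri, src)).append("\"/>");
        }
        return html.append("</body></html>").toString();
    }

    public static void main(String[] args) {
        String base = "http://www.dcu.ie/news/index.shtml";
        // Hypothetical image paths: one absolute, one page-relative, one root-relative.
        List<String> srcs = Arrays.asList(
                "http://www.dcu.ie/images/one.jpg",
                "images/two.jpg",
                "/images/three.jpg");
        System.out.println(buildHtml(base, srcs));
    }
}
```

With Jsoup itself, the loop in the answer does the same job: collect img.absUrl("src") for each element, wrap each in an <img> tag, and pass the finished string to loadData.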

