Jsoup - Improving image extraction from an HTML page

Question

I'm getting images from the web using these lines of code:

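for (int i = 0; i < links.size(); i++) {
    try {
        // timeout(0) disables the timeout, so a stalled server blocks the loop indefinitely
        doc = Jsoup.connect(links.get(i)).userAgent("Mozilla").ignoreHttpErrors(true).timeout(0).get();
        // this local variable reuses the name of the outer list of page URLs
        Elements links = doc.getElementsByTag("img");
        // take the fourth <img> element and cut its first quoted attribute value out of the tag's HTML
        imageLink = links.get(3).toString();
        String[] bits = imageLink.split("\"");
        imageLink = bits[1];
        System.out.println(imageLink);
        url = new URL(imageLink);
        image = ImageIO.read(url);
        images.add(image);
    } catch (IOException e) {
        e.printStackTrace();
    }
}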

This code works, but it is really slow: I get about one image per second, and I need it to take at most half that time. Is there anything I can do to improve it?

Answer 1

You can replace this:

imageLink=links.get(3).toString();
String[] bits=imageLink.split("\"");
imageLink=bits[1];

With this:

imageLink = links.get(3).attr("src");

Read more about extracting attributes here: http://jsoup.org/cookbook/extracting-data/attributes-text-html
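
For illustration, here is a minimal sketch of why reading the attribute directly is more robust than splitting the tag's HTML on quotes. It assumes jsoup is on the classpath; the class name ImgSrcSketch, the helper imageUrlAt, and the sample markup are hypothetical and not part of the original question. It also uses absUrl("src") rather than attr("src"), which is an extra step beyond the suggestion above: it resolves a relative src against the page's base URI so the result can be passed to new URL(...).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class ImgSrcSketch {

    // Hypothetical helper: returns the absolute URL of the <img> at the given
    // index, or an empty string when the page has no image at that index.
    static String imageUrlAt(Document doc, int index) {
        Elements imgs = doc.getElementsByTag("img");
        if (index < 0 || index >= imgs.size()) {
            return "";
        }
        Element img = imgs.get(index);
        // absUrl resolves a relative src against the document's base URI.
        return img.absUrl("src");
    }

    public static void main(String[] args) {
        // Made-up markup standing in for a fetched page.
        String html = "<html><body>"
                + "<img alt=\"logo\" src=\"/static/logo.png\">"
                + "<img src=\"http://example.com/photo.jpg\" alt=\"photo\">"
                + "</body></html>";
        Document doc = Jsoup.parse(html, "http://example.com/");

        // Splitting on quotes picks up whichever attribute is written first,
        // so the result depends on attribute order rather than on src.
        String viaSplit = doc.getElementsByTag("img").get(0).toString().split("\"")[1];
        System.out.println("split on quotes: " + viaSplit);

        // Reading the attribute by name does not depend on attribute order.
        System.out.println("absUrl, image 0: " + imageUrlAt(doc, 0));
        System.out.println("absUrl, image 1: " + imageUrlAt(doc, 1));
    }
}
```

The bounds check is there because get(3) in the original loop throws when a page has fewer than four images; returning an empty string is just one possible way to signal that case.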

Answer 1: accepted, 2014-01-19 09:36:28
