
How to know if an Element is empty or null in Jsoup

I'm trying to collect the images for a list of articles from this URL using the Jsoup library.

When an article has no embedded image, I use a standard picture instead. Here is what I do:

for(Element img : document.select(".rullo .rullo-item .lazy>a img[src]")) {
    String imageMainUrl = img.attr("src");
    if(img.attr("src") == null || img.attr("src").equals("") || 
           img.attr("src").isEmpty()){
        images.add(bmp);
    } else {
        String newString = imageMainUrl.replace("data:image/gif;base64,", "");
        byte[] decodedString = Base64.decode(imageMainUrl, Base64.DEFAULT);
        Bitmap bitmap = BitmapFactory.decodeByteArray(decodedString, 0, decodedString.length);                  
        images.add(bitmap);
    }                   
}

The problem is that it never enters the if branch. How can I tell whether the element is empty or null? Thanks!

How about this:

for (Element img : doc.select(".rullo .rullo-item .lazy>a img[src]")) {
    // The real image URL is in data-src; src only holds an inline placeholder.
    String imageMainUrl = img.attr("data-src");
    Bitmap bitmap = null;
    try {
        URLConnection connection = new URL(imageMainUrl).openConnection();
        // getContentLength() returns -1 when the length is unknown.
        if (connection.getContentLength() > 0) {
            bitmap = BitmapFactory.decodeStream(connection.getInputStream());
        }
    } catch (IOException e) {
        // Also covers MalformedURLException, e.g. when data-src is empty.
        e.printStackTrace();
    }
    // Fall back to the standard picture when nothing could be loaded.
    images.add(bitmap != null ? bitmap : bmp);
}

I noticed that the actual image URL is stored in data-src, not in src.
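
Building on that observation, one way to choose the URL is to prefer data-src and to treat a src that holds an inline data: URI as a lazy-loading placeholder rather than a real image. The sketch below is plain Java with no Jsoup or Android dependency. The class ImageUrlPicker and its method names are made up for illustration, and the two arguments are assumed to be the results of img.attr("data-src") and img.attr("src"). Note that Jsoup's attr() returns an empty string, never null, when the attribute is missing.

```java
import java.util.Optional;

/** Hypothetical helper: chooses which attribute value to treat as the image URL. */
final class ImageUrlPicker {

    /** True when the value is an inline data: URI, as lazy-loading placeholders often are. */
    static boolean isDataUri(String value) {
        return value != null && value.regionMatches(true, 0, "data:", 0, 5);
    }

    /** Returns the trimmed value only when it is non-blank and not a data: URI. */
    private static Optional<String> usable(String value) {
        if (value == null || value.trim().isEmpty() || isDataUri(value.trim())) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }

    /** Prefers data-src, falls back to src, and is empty when neither holds a real URL. */
    static Optional<String> pick(String dataSrc, String src) {
        Optional<String> fromDataSrc = usable(dataSrc);
        return fromDataSrc.isPresent() ? fromDataSrc : usable(src);
    }
}
```

With this in place, an empty result from pick(img.attr("data-src"), img.attr("src")) would be the signal to add the standard picture.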

I solved my problem!
Here is the code:

for (Element picture : document.select(".rullo .rullo-item picture")) {
    Elements imageElement = picture.getElementsByClass("attachment-rullo");
    // Elements.attr() returns "" (never null) when no element has the attribute.
    String imageUrl = imageElement.attr("data-src");
    if (imageUrl.isEmpty()) {
        images.add(bmp);
    } else {
        // try-with-resources closes the stream once the bitmap is decoded.
        try (InputStream input = new java.net.URL(imageUrl).openStream()) {
            Bitmap bitmap = BitmapFactory.decodeStream(input);
            images.add(bitmap);
        } catch (IOException e) {
            // Covers MalformedURLException and FileNotFoundException as well.
            e.printStackTrace();
            images.add(bmp);
        }
    }
}

I was able to solve my problem by selecting the parent of the image: if the parent has no content, the article has no embedded picture; otherwise it does.
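
The fallback pattern in this solution (empty URL or failed download means "use the standard picture") can be tried outside Android as well. The following is a minimal sketch under stated assumptions: ImageFetcher and fetchOrDefault are hypothetical names, raw bytes stand in for Bitmap, and InputStream.readAllBytes() requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

/** Hypothetical helper: downloads raw image bytes, or returns a fallback. */
final class ImageFetcher {

    static byte[] fetchOrDefault(String imageUrl, byte[] fallback) {
        // An empty value means the article has no embedded picture.
        if (imageUrl == null || imageUrl.isEmpty()) {
            return fallback;
        }
        // try-with-resources closes the stream even when reading fails.
        try (InputStream input = new URL(imageUrl).openStream()) {
            return input.readAllBytes();
        } catch (IOException e) {
            // Covers MalformedURLException and FileNotFoundException as well.
            return fallback;
        }
    }
}
```

On Android the same shape would apply with BitmapFactory.decodeStream(input) in place of readAllBytes() and bmp as the fallback.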

Try printing imageMainUrl with Log.i("value", imageMainUrl); that should help you track down the problem. Also, please post the HTML DOM of the articles you are fetching the images from.
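
When logging attribute values like this, an empty string and a long inline data: URI are both hard to read in the log. A small formatting helper can make the difference visible. This is only a sketch; AttrDebug and describe are hypothetical names, not part of Jsoup or Android.

```java
/** Hypothetical helper: makes an attribute value readable in a log line. */
final class AttrDebug {

    static String describe(String value) {
        if (value == null) {
            return "<null>";
        }
        if (value.isEmpty()) {
            return "<empty>";
        }
        if (value.startsWith("data:")) {
            // Inline images can be very long, so log only the header and the length.
            int comma = value.indexOf(',');
            String header = comma >= 0 ? value.substring(0, comma) : "data:";
            return header + " (" + value.length() + " chars)";
        }
        return value;
    }
}
```

It could be used as Log.i("value", AttrDebug.describe(img.attr("src"))) to see at a glance whether src is empty, a placeholder, or a real URL.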


 