
Jsoup not connecting to webpage in Android Studio

I am working on a project in which a class uses jsoup, in a method called retrieveMedia, to return an ArrayList filled with data from a webpage. I run the request in a separate thread, since you shouldn't connect to URLs from the main thread; I start the thread and then join it. However, it doesn't work in Android Studio, even though the same code worked fine when I tested it separately in Eclipse. No matter what I do, I can't get jsoup to connect to the webpage. Below is my class, MediaRetriever.

public class MediaRetriever {

    public ArrayList<Media> retrieveMedia() {
        // Store each scraped post; final so the anonymous Runnable below can capture it
        final ArrayList<Media> mediaOutput = new ArrayList<Media>();
        Thread downloadThread = new Thread(new Runnable() {
            public void run() {
                Document doc = null;
                try {
                    doc = Jsoup.connect(<Website Im connecting to>).timeout(20000).get();
                } catch (IOException e) {
                    System.out.println("Failed to connect to webpage.");
                    mediaOutput.add(new Media("Failed to connect", "oops", "", "oh well"));
                    return;
                }
                
                Elements mediaFeed = doc.getElementById("main").getElementsByClass("node");

                for (Element e : mediaFeed) {
                    String title, author, imageUrl, content;
                    title=e.getElementsByClass("title").text().trim();
                    author=e.getElementsByClass("content").select("em").text().trim();
                    content=e.getElementsByClass("content").text().replace(author,"").trim();
                    Media media = new Media(title, author, "", content);
                    mediaOutput.add(media);
                }
            }
        });
        downloadThread.start();
        try {
            downloadThread.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        
        return mediaOutput;

    }
}

When I call this class's method from another class, it never connects. Any ideas?

Since you say the problem occurs only on Android, it looks like you should add a user agent string to your request. First get the user agent string of a browser that displays the site correctly, then add it to the request:

doc = Jsoup.connect(<Website Im connecting to>)
           .userAgent("your-user-agent-string")
           .timeout(20000).get();

As a side note: if you are catching an exception, don't print only your own error message. Print the original message as well; it may be very useful.
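
To illustrate that side note, here is a minimal sketch of a catch block that keeps the original exception details. It does not call jsoup itself: the Fetcher interface and the names FetchErrorDemo, describe and fetchOrExplain are hypothetical, and Fetcher only stands in for a call such as Jsoup.connect(...).get(). The error text in main is likewise made up for the demonstration.

```java
import java.io.IOException;

public class FetchErrorDemo {

    /** Hypothetical stand-in for a network call such as Jsoup.connect(url).get(). */
    public interface Fetcher {
        String fetch() throws IOException;
    }

    /** Builds a log line that keeps the exception type and its original message. */
    public static String describe(IOException e) {
        return e.getClass().getName() + ": " + e.getMessage();
    }

    /** Runs the fetch and returns either its result or the original error text. */
    public static String fetchOrExplain(Fetcher fetcher) {
        try {
            return fetcher.fetch();
        } catch (IOException e) {
            // Print the full stack trace to standard error instead of a generic message.
            e.printStackTrace();
            return describe(e);
        }
    }

    public static void main(String[] args) {
        // Simulate a failed connection with a made-up error message.
        String result = fetchOrExplain(() -> {
            throw new IOException("simulated connection failure");
        });
        System.out.println(result);
    }
}
```

In the question's code, the same idea would mean replacing the fixed "Failed to connect to webpage." line in the catch block with the exception's own message or stack trace, so that the actual cause is visible.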
