Getting a block of text using Jsoup

Basically, what I'm trying to do is put the song and artist into the URL, which takes me to the page with that song's lyrics; I then need to find the correct way to extract those lyrics. I'm new to Jsoup. So far, the issue is that I can't figure out how to select the lyrics. I've tried getting the first "div" after the "b" element, but it doesn't work the way I expected.

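import java.io.IOException;
import java.net.MalformedURLException;
import java.util.Scanner;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public static void search() throws MalformedURLException {
    Scanner search = new Scanner(System.in);

    // Read the artist and song names and lowercase them.
    String artist = search.nextLine().toLowerCase();
    System.out.println("Artist saved");
    String song = search.nextLine().toLowerCase();
    System.out.println("Song saved");

    // Strip spaces so the names can be used as URL path segments.
    artist = artist.replaceAll(" ", "");
    System.out.println(artist);
    song = song.replaceAll(" ", "");
    System.out.println(song);

    try {
        // Fetch and parse the lyrics page for this artist and song.
        Document doc = Jsoup.connect("http://www.azlyrics.com/lyrics/" + artist + "/" + song + ".html").get();
        System.out.println(doc.title());

        // Print the text of the first div that has any text, then stop.
        for (Element element : doc.select("div")) {
            if (element.hasText()) {
                System.out.println(element.text());
                break;
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}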

I don't know whether this is consistent across all song pages, but on the page you've shown, the lyrics are in the div element whose style attribute starts with "margin". If that holds, you could try something along the lines of:

Elements eles = doc.select("div[style^=margin]");         
System.out.println(eles.html());
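
To try the attribute-prefix selector without hitting the site, here is a minimal sketch that runs it against an inline HTML string. The markup, the class name LyricsSelectorSketch, and the helper firstMarginDivText are made up for illustration; the real page layout may differ, and the sketch assumes Jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

final class LyricsSelectorSketch {

    // Hypothetical helper: returns the text of the first div whose style
    // attribute starts with "margin", or null when no such div exists.
    static String firstMarginDivText(String html) {
        Document doc = Jsoup.parse(html);
        // Elements.first() returns null when the selection is empty.
        Element lyrics = doc.select("div[style^=margin]").first();
        return lyrics == null ? null : lyrics.text();
    }

    public static void main(String[] args) {
        // Assumed markup, loosely modelled on the description above.
        String html = "<html><body>"
                + "<div class=\"nav\">menu</div>"
                + "<b>\"Song Title\"</b>"
                + "<div style=\"margin-left:10px;\">la la la</div>"
                + "</body></html>";
        System.out.println(firstMarginDivText(html));
    }
}
```

Checking for null before reading the text avoids a NullPointerException on pages where the lyrics div is styled differently.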

Or, if the lyrics are always in the 6th div element, you could select it by index (indexes are zero-based, so the 6th element is at index 5):

Elements eles = doc.select("div");
if (eles.size() >= 6) {
    System.out.println(eles.get(5).html());
}
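
Because Jsoup's Elements is an ArrayList<Element>, the bounds check for a position-based lookup can be sketched with a plain java.util.List. The class NthElementSketch and the helper nthOrNull are hypothetical names used only for this illustration; the strings stand in for the div elements.

```java
import java.util.Arrays;
import java.util.List;

final class NthElementSketch {

    // Hypothetical helper: returns the item at 1-based position n,
    // or null when the list has fewer than n items (or n is below 1).
    static <T> T nthOrNull(List<T> items, int n) {
        if (n < 1 || n > items.size()) {
            return null;
        }
        // Lists are zero-based, so position n lives at index n - 1.
        return items.get(n - 1);
    }

    public static void main(String[] args) {
        // Stand-ins for the divs returned by doc.select("div").
        List<String> divs = Arrays.asList("d1", "d2", "d3", "d4", "d5", "d6");
        System.out.println(nthOrNull(divs, 6)); // prints d6
        System.out.println(nthOrNull(divs, 7)); // prints null
    }
}
```

With Jsoup, the same helper could be called as nthOrNull(doc.select("div"), 6), which keeps the off-by-one conversion in one place.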

