
Weather website jsoup java

I have the following code:

`

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class da {

    public static void main(String[] args) {
        try {
            Document doc = Jsoup.connect("http://www.vremea.net/").get();
            // every <li> that is a child of a <ul> that is a child of .homeContent
            Elements e = doc.select(".homeContent>ul>li ");
            // try-with-resources closes the output file "io" when the loop is done
            try (PrintStream ps = new PrintStream(new FileOutputStream("io"))) {
                for (int i = 0; i < e.size(); i++) {
                    ps.println(e.get(i).text());
                    System.out.println(e.get(i).text());
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

`

I want to access this website, http://www.vremea.net/. It has a "homeContent" block that contains several "ul" lists, each with its own "li" items. My code goes through all of the "ul" lists, but it does not return them in the order shown on the website (I get the second column, then the third, then the fourth, and only then the first). Why do they come back in this order?

EDIT: What would be a more generic way of doing this? I mean, if the owner of the site changes the page structure, how could I still get this list the way I did here without modifying the code?

Your code looked correct, so I took it (slightly reformatted)

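import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class NewClass {
    public static void main(String[] args) {
        try {
            Document doc = Jsoup.connect("http://www.vremea.net/").get();
            Elements e = doc.select(".homeContent>ul>li ");
            // try-with-resources closes the output file "io" when the loop is done
            try (PrintStream ps = new PrintStream(new FileOutputStream("io"))) {
                for (int i = 0; i < e.size(); i++) {
                    ps.println(e.get(i).text());
                    System.out.println(e.get(i).text());
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}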

and ran it with Jsoup 1.9.2 and 1.8.3 on Java 8u60, Java 8u91 and Java 7.

On every run I get the following output, which I assume is exactly what you are looking for:

• Bucuresti
• Adjud
• Aiud
• Alba Iulia
• Alexandria
• Arad
• Bacau
• Baia Mare
• Bailesti
• Barlad
• Beius
• Bistrita
• Blaj
• Botosani
• Brad
• Braila
• Brasov
• Buzau
• Calafat
• Calarasi
• Campia Turzii
• Campina
• Campulung Moldovenesc
• Campulung-Muscel
• Caracal
• Caransebes
• Carei
• Cluj-Napoca
• Codlea
• Constanta
• Craiova
• Curtea de Arges
• Dej
• Deva
• Dorohoi
• Dragasani
• Drobeta-Turnu Severin
• Fagaras
• Falticeni
• Fetesti
• Focsani
• Galati
• Gheorgheni
• Gherla
• Giurgiu
• Hunedoara
• Husi
• Iasi
• Lugoj
• Lupeni
• Mangalia
• Marghita
• Medgidia
• Medias
• Miercurea Ciuc
• Moinesti
• Moreni
• Motru
• Odorheiu Secuiesc
• Oltenita
• Onesti
• Oradea
• Orastie
• Orsova
• Pascani
• Petrosani
• Piatra Neamt
• Pitesti
• Ploiesti
• Radauti
• Ramnicu Sarat
• Ramnicu Valcea
• Reghin
• Resita
• Roman
• Rosiori de Vede
• Sacele
• Salonta
• Satu Mare
• Sebes
• Sfantu Gheorghe
• Sibiu
• Sighetu Marmatiei
• Sighisoara
• Slatina
• Slobozia
• Suceava
• Targoviste
• Targu Jiu
• Targu Mures
• Targu Secuiesc
• Tarnaveni
• Tecuci
• Timisoara
• Toplita
• Tulcea
• Turda
• Turnu Magurele
• Urziceni
• Vaslui
• Vatra Dornei
• Vulcan
• Zalau
• Zimnicea

So I am unable to reproduce the behaviour you are describing. You might want to try a different or more up-to-date version of Jsoup (or even Java) and check whether the problem persists.

Even though you have already found the issue, I just wanted to point out that Document.select() returns Elements, which has ArrayList as a superclass, so you can iterate over it directly:

    for (Element item : doc.select(".homeContent > ul > li > a"))
        System.out.println(item.ownText());
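
The same point can be checked without a network connection or the jsoup jar. The sketch below is only an illustration: `CityList` is a hypothetical stand-in for `Elements` (it is not part of jsoup), and the city names are taken from the output shown earlier. Because it extends `ArrayList`, the indexed loop from the question and the for-each loop visit the same items in the same order, so the choice of loop should not affect the ordering.

```java
import java.util.ArrayList;
import java.util.List;

public class IterationSketch {

    // Hypothetical stand-in for jsoup's Elements: a list type that extends ArrayList.
    static class CityList extends ArrayList<String> {
    }

    // Indexed loop, in the style of the question's code.
    static List<String> byIndex(CityList cities) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < cities.size(); i++) {
            out.add(cities.get(i));
        }
        return out;
    }

    // For-each loop; works because ArrayList (and so CityList) is Iterable.
    static List<String> byForEach(CityList cities) {
        List<String> out = new ArrayList<>();
        for (String city : cities) {
            out.add(city);
        }
        return out;
    }

    public static void main(String[] args) {
        CityList cities = new CityList();
        cities.add("Bucuresti");
        cities.add("Adjud");
        cities.add("Aiud");

        // Both loops walk the list in insertion order.
        System.out.println(byIndex(cities));   // prints [Bucuresti, Adjud, Aiud]
        System.out.println(byForEach(cities)); // prints [Bucuresti, Adjud, Aiud]
    }
}
```

With the real `Elements` class the loop variable would be an `Element` rather than a `String`, as in the snippet above.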
