
How do I store the links of a webpage in a set using Jsoup?

I am trying to store the links of a web page in a set (since sets don't allow duplicate strings).

I then want to parse the HTML behind each of those links and store the links found there in another set.

So far all I have is this:

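    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.select.Elements;

    public class LinkDemo {   // class name is arbitrary, added so the snippet compiles
        public static void main(String[] args) throws IOException {
            // Fetch and parse the page
            Document doc = Jsoup.connect("http://en.wikipedia.org/wiki/Matrix_(mathematics)").get();
            // Select every <a> element inside <body>
            Elements links = doc.select("body a");
            System.out.println(links);
        }
    }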

You can traverse the Element objects in links with

    for (Element e : links) {
        // called for every element, add them to a set if you wish
    }
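As a rough sketch of how that loop body could be filled in: the example below stores each link's absolute URL as a string, so duplicates are judged by URL. The class name `LinkCollector`, the helper `uniqueLinks`, and the inline HTML with its `http://example.com/` base are made up for illustration. It parses a string with `Jsoup.parse` instead of fetching a page so it runs without network access.

```java
import java.util.LinkedHashSet;
import java.util.Set;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class LinkCollector {

    // Hypothetical helper: collects the absolute URL of every <a href> in the HTML.
    static Set<String> uniqueLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        Set<String> urls = new LinkedHashSet<>();   // keeps first-seen order
        for (Element e : doc.select("a[href]")) {
            String url = e.absUrl("href");          // href resolved against baseUri
            if (!url.isEmpty()) {                   // empty when it cannot be resolved
                urls.add(url);                      // add() ignores a URL already present
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up sample markup: "/a" appears twice on purpose.
        String html = "<body><a href='/a'>one</a><a href='/b'>two</a><a href='/a'>again</a></body>";
        System.out.println(uniqueLinks(html, "http://example.com/"));
    }
}
```

With the code from the question you would pass the fetched `Document` through the same loop; `LinkedHashSet` is only one option, and a plain `HashSet` works if order does not matter.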

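Because the Elements class implements java.util.List<Element>, you can also pass it straight to a HashSet constructor to get a set:

    Set<Element> linkSet = new HashSet<Element>(links);

Keep in mind that a HashSet decides what counts as a duplicate through equals() and hashCode() of its elements, so two separate <a> elements that point to the same URL may both remain in a Set<Element>. If you want one entry per URL, store the href strings instead.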
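For the second step in the question (visiting each collected link and putting the links found there into another set), here is a minimal sketch using only the standard library. `TwoLevelCrawl`, `secondLevel`, and the `fetchLinks` hook are hypothetical names. With Jsoup, the hook would presumably fetch each page with `Jsoup.connect(page).get()` and return its href values. Here it is backed by canned data so the sketch runs offline (Java 9+ for `Map.of` and `List.of`).

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class TwoLevelCrawl {

    // fetchLinks is a hypothetical hook: given a page URL, return the links on that page.
    static Set<String> secondLevel(Set<String> firstLevel,
                                   Function<String, Collection<String>> fetchLinks) {
        Set<String> found = new LinkedHashSet<>();
        for (String page : firstLevel) {
            // addAll() skips links that an earlier page already contributed
            found.addAll(fetchLinks.apply(page));
        }
        return found;
    }

    public static void main(String[] args) {
        // Canned data standing in for real pages (assumption, not real URLs).
        Map<String, List<String>> fakeWeb = Map.of(
            "http://example.com/a", List.of("http://example.com/x", "http://example.com/y"),
            "http://example.com/b", List.of("http://example.com/y", "http://example.com/z"));

        Set<String> first = new LinkedHashSet<>(
            List.of("http://example.com/a", "http://example.com/b"));
        Set<String> second = secondLevel(first, page -> fakeWeb.getOrDefault(page, List.of()));

        System.out.println(second);
        // prints [http://example.com/x, http://example.com/y, http://example.com/z]
    }
}
```

Keeping the fetching behind a function parameter is a design choice, not a requirement: it lets you try out the set logic without hitting the network, and swap in a real Jsoup-based fetcher later.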

References:

JavaDocs HashSet - java.util.HashSet

Jsoup Docs Elements - org.jsoup.select.Elements
