
Jsoup: getting certain attributes from a website

I recently started using Jsoup and found this sample code. Because I'm a newbie, I can't figure out how it finds all the links on a website. Could anyone explain what happens in the for loop? Mainly, I've never used this for-loop syntax before, so it's a little confusing to me, and I don't really understand what the loop iterates over. Thank you!

    Elements links = doc.select("a[href]");
    for (Element link : links) {

        // get the value from href attribute
        System.out.println("\nlink : " + link.attr("href"));
        System.out.println("text : " + link.text());

    }

This works because Elements implements Iterable<Element> (org.jsoup.select.Elements and java.lang.Iterable).

So when you use this for syntax, you loop over links, your Elements object, which is effectively a List of Element. The declaration "Element link" is the local variable that is assigned each element of links in turn as you iterate.
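To make the Iterable point concrete, here is a minimal sketch of what the for-each syntax amounts to. It uses a plain List<String> of made-up href values instead of a real Elements object, so it runs without Jsoup; the class and method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ForEachDesugared {
    // Enhanced for: the same form as the loop in the question.
    static List<String> withForEach(Iterable<String> links) {
        List<String> seen = new ArrayList<>();
        for (String link : links) {
            seen.add(link);
        }
        return seen;
    }

    // Roughly what the compiler turns the loop above into:
    // ask the Iterable for an Iterator, then pull items one by one.
    static List<String> withIterator(Iterable<String> links) {
        List<String> seen = new ArrayList<>();
        Iterator<String> it = links.iterator();
        while (it.hasNext()) {
            String link = it.next(); // plays the role of "Element link"
            seen.add(link);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Sample data standing in for the href values of selected links.
        List<String> links = Arrays.asList("/home", "/about", "https://example.com/");
        System.out.println(withForEach(links));
        System.out.println(withIterator(links));
    }
}
```

Both methods visit the same items in the same order. Any class that implements Iterable, Elements included, can appear on the right-hand side of the colon.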

For further information, see:

http://jsoup.org/apidocs/org/jsoup/select/Elements.html and http://jsoup.org/apidocs/index.html

As the names suggest, the classes Elements and Element are closely related. One represents a single element that has been selected; the other is a collection of multiple elements grouped together.

The variable links holds an Elements collection made up of the Element objects that were selected.

The Elements class implements the following Java interfaces:

Cloneable, Iterable<Element>, Collection<Element>, List<Element>.

The Elements class is implemented using ArrayList<Element>, so it is easy to add Element objects to an Elements collection and remove them from it.
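The following sketch shows why building a class on ArrayList gives it add, remove and for-each support for free. Link and Links are hypothetical stand-ins for Element and Elements, not the real Jsoup classes, and firstHref is an invented helper; the sample data is made up as well.

```java
import java.util.ArrayList;

class LinksDemo {
    // Hypothetical stand-in for org.jsoup.nodes.Element: an href plus its text.
    static class Link {
        final String href;
        final String text;

        Link(String href, String text) {
            this.href = href;
            this.text = text;
        }
    }

    // Hypothetical stand-in for org.jsoup.select.Elements.
    // Extending ArrayList supplies add, remove, size and iterator().
    static class Links extends ArrayList<Link> {
        // Invented convenience method working on the whole collection.
        String firstHref() {
            return isEmpty() ? "" : get(0).href;
        }
    }

    static Links sample() {
        Links links = new Links();
        links.add(new Link("/home", "Home"));
        links.add(new Link("/about", "About"));
        return links;
    }

    public static void main(String[] args) {
        Links links = sample();
        links.remove(0); // inherited from ArrayList
        for (Link link : links) { // allowed because ArrayList is Iterable
            System.out.println("link : " + link.href);
            System.out.println("text : " + link.text);
        }
    }
}
```

The loop in main has the same shape as the one in the question, with link.href and link.text taking the place of link.attr("href") and link.text().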

The for loop is a simple way to iterate over each Element object in the Elements collection called links.

The loop iterates through the collection and, on each pass, assigns the current Element object in links to the variable link. Inside the loop body, the href attribute and the text of the current link are printed, and then the loop continues with the next Element object in the collection.


This loop syntax is often called a for-each loop, since it iterates over each object in a list or collection.
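If the classic counter-based loop is more familiar, a side-by-side comparison may help. This is only a sketch with made-up href strings; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class LoopStyles {
    // Classic indexed loop: the counter and get(i) are written by hand.
    static int totalLengthIndexed(List<String> links) {
        int total = 0;
        for (int i = 0; i < links.size(); i++) {
            total += links.get(i).length();
        }
        return total;
    }

    // For-each over a List: no counter, each item is handed to the body.
    static int totalLengthForEach(List<String> links) {
        int total = 0;
        for (String link : links) {
            total += link.length();
        }
        return total;
    }

    // For-each also works on plain arrays.
    static int totalLengthArray(String[] links) {
        int total = 0;
        for (String link : links) {
            total += link.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> links = Arrays.asList("/home", "/about");
        System.out.println(totalLengthIndexed(links));
        System.out.println(totalLengthForEach(links));
        System.out.println(totalLengthArray(new String[] {"/home", "/about"}));
    }
}
```

All three methods compute the same total; the for-each versions simply leave the bookkeeping to the language.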

It is worth reading up on; the Java Language Specification calls it the enhanced for statement.


Look through the Jsoup API docs to learn more about how to use it!


If you want to learn more about how Jsoup is implemented, take a look at the source code!

