Jsoup Select with multiple queries / Is there an OR operation inside Jsoup select?

I know that we can select multiple elements by using:

doc.select("div.myclass > p,h2");  // select p or h2 inside myclass

But how can I select something like this:

doc.select("div.myclass > p, h2" || "div.myclass > p > a");// this is a fake function

I want to select both (p, h2) inside myclass and (p > a) inside myclass.

If I only use

doc.select("div.myclass > p");

I cannot get the content of the a inside p.

How can I do that?

Your assumption that doc.select("div.myclass > p,h2"); will select, besides the p elements, only h2 elements that are direct children of a div with class myclass is not correct. In the Jsoup CSS implementation the , operator splits the query into independent alternatives before the > operator is applied, so the query is read as "div.myclass > p" or "h2". In your example it therefore selects all h2 elements, regardless of where they are in the DOM.

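import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

String html = ""
        + "<h2>header1</h2>"
        + "<div class=\"myclass\">"
        + "<h2>header2</h2>"
        + "    <p>p1</p>"
        + "    <div class=\"myclass2\">"
        + "    <p>p2</p>"
        + "    </div>"
        + "</div>"
        ;
Document doc = Jsoup.parse(html);
// Read as "div.myclass > p" OR "h2": matches header1, header2 and p1.
Elements els1 = doc.select("div.myclass > p,h2");
System.out.println(els1+"\n");
// Each alternative repeats the parent, so only header2 and p1 match.
Elements els2 = doc.select("div.myclass > p, div.myclass > h2");
System.out.println(els2+"\n");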

In the above example you can see for yourself that the output of els1 includes the h2 element that is not a child of the div.
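To make the difference easier to see than with the full element dump, here is a minimal sketch that prints only the text of each match. The class name SelectorGrouping and the helper texts are made up for illustration; the HTML is the same as in the example above, and it assumes a jsoup version that provides Elements.eachText().

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SelectorGrouping {
    // Same markup as in the example above.
    static final String HTML = "<h2>header1</h2>"
            + "<div class=\"myclass\">"
            + "<h2>header2</h2>"
            + "<p>p1</p>"
            + "<div class=\"myclass2\"><p>p2</p></div>"
            + "</div>";

    // Hypothetical helper: runs a query and returns the text of each match.
    static List<String> texts(String query) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(query).eachText();
    }

    public static void main(String[] args) {
        // The comma splits first, so every h2 in the document matches.
        System.out.println(texts("div.myclass > p,h2"));                // [header1, header2, p1]
        // Repeating the parent in each alternative restricts both to div.myclass.
        System.out.println(texts("div.myclass > p, div.myclass > h2")); // [header2, p1]
    }
}
```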

To select all p elements that are inside div.myclass, even if they are not direct children, you can use the space (descendant) operator:

Elements ps = doc.select("div.myclass p");

This results in the following output with the HTML from my example above:

<p>p1</p>
<p>p2</p>
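For the "OR" in the question itself, the comma already is the OR; each alternative just has to spell out its full path. The following is a sketch under that assumption. The class name CombinedSelect, the method selectAll and the sample markup are invented for illustration and are not part of the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class CombinedSelect {
    // Hypothetical helper: p, h2 and p > a, each restricted to div.myclass.
    static Elements selectAll(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("div.myclass > p, div.myclass > h2, div.myclass > p > a");
    }

    public static void main(String[] args) {
        // Illustrative markup: one h2 outside the div, the rest inside it.
        String html = "<h2>outside</h2>"
                + "<div class=\"myclass\">"
                + "<h2>inside</h2>"
                + "<p>text <a href=\"#\">link</a></p>"
                + "</div>";
        // Matches come back in document order: h2, p, a.
        for (Element e : selectAll(html)) {
            System.out.println(e.tagName() + ": " + e.text());
        }
    }
}
```

With this markup the h2 outside the div is not selected, and the a shows up as its own element in addition to being part of the text of its parent p.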

Have a look at the Jsoup documentation for the meaning of the other possible operators.
