
Length in Jsoup Element.select regex

Is the length part of the regex in Jsoup's Element.select supposed to work? I am trying to find paragraph elements whose content is between 3 and 30 characters long, but it doesn't seem to work. I am doing it like this:

Elements e = doc.select("p:matchesOwn({3,30}");

It seems to return all p elements, no matter how long they are.

What am I missing?

I don't know if this is what you're looking for, but I have found that Jsoup ends up selecting all the elements in cases like this. Instead, try doing it in two steps:

  1. Select all p elements.
  2. Loop through them and keep the ones whose text length is between 3 and 30.

Like this:

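Elements paragraphs = doc.select("p");
for (Element paragraph : paragraphs) {
    // ownText() is the element's own text; toString() would also count the <p> tags
    int length = paragraph.ownText().length();
    if (length >= 3 && length <= 30) {
        // paragraph is between 3 and 30 characters long
    }
}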

The following regex works for me:

String html = "<p>aaa</p>" +
        "<p>bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb</p>" +
        "<p>cccc</p>" +
        "<p>d</p>";

Document doc = Jsoup.parse(html);
Elements e = doc.select("p:matches(^.{3,30}$)");

System.out.println(e);

Which outputs:

<p>aaa</p>
<p>cccc</p>
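
The `^` and `$` anchors are what make the length check work here. As far as I understand it (this is an assumption about jsoup's internals, not something stated above), `:matches` and `:matchesOwn` test the text with "find" semantics, so an unanchored `.{3,30}` succeeds on any text that merely contains a run of 3 or more characters. The sketch below shows the difference using only `java.util.regex`; the class name `LengthRegexDemo`, the helper `found`, and the sample strings are made up for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;

public class LengthRegexDemo {
    // Unanchored: find() succeeds as soon as any run of 3 to 30 characters exists.
    public static final Pattern UNANCHORED = Pattern.compile(".{3,30}");
    // Anchored: the whole string must be 3 to 30 characters long.
    public static final Pattern ANCHORED = Pattern.compile("^.{3,30}$");

    // Hypothetical helper: applies the pattern with find() semantics.
    public static boolean found(Pattern pattern, String text) {
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        // Sample strings of length 3, 34, 4 and 1.
        List<String> samples = List.of("aaa", "b".repeat(34), "cccc", "d");
        for (String s : samples) {
            System.out.printf("length=%d unanchored=%b anchored=%b%n",
                    s.length(), found(UNANCHORED, s), found(ANCHORED, s));
        }
    }
}
```

With the anchored pattern only the 3- and 4-character samples pass, while the unanchored one also accepts the 34-character string. Note that `.` does not match line terminators by default, so text containing newlines would need a flag such as `(?s)`.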
