
Extracting an element with jsoup by matching text in an attribute value

How do I select a span by text that appears inside one of its attributes? I am trying to extract the number that comes after the text "stars". Specifically, I want to select the span tag whose class is "rating_sprite stars5" and pull the value "stars5" out of the class attribute so that I can get the 5 from it.

Currently I don't get any elements back!

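    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.select.Elements;

    public class RatingScraper {
        public static void main(String[] args) throws IOException {
            String url = "https://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F00000WYA1";
            // Fetch and parse the page; letting IOException propagate avoids
            // calling select() on a null Document after a failed request.
            Document doc = Jsoup.connect(url).get();
            // Select every span that carries the rating_sprite class.
            Elements spans = doc.select("span.rating_sprite");
            System.out.println(spans);
        }
    }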

The HTML snippet looks something like this:

<div class="snapshotTitleBox">
  <h1>Comgest</h1>
  <span class="rating_sprite stars5"></span>
  <span class="rating_sprite analyst-rating-5"></span>
  <div style="float:right; margin-top:6px;"></div>
</div>

jsoup's selectors can find the element, but they cannot return just part of a class name. Since you already have all span.rating_sprite elements, you can iterate over them and look for a class name matching the regular expression stars(\d) (written "stars(\\d)" in a Java string literal). The first capturing group then contains only the number:

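    // Requires java.util.regex.Pattern, java.util.regex.Matcher
    // and org.jsoup.nodes.Element.
    Pattern p = Pattern.compile("stars(\\d)");
    for (Element span : spans) {
        // classNames() returns each class of the span as a separate string.
        for (String className : span.classNames()) {
            Matcher m = p.matcher(className);
            // matches() requires the whole class name to be "stars" plus one digit.
            if (m.matches()) {
                System.out.println("stars: " + m.group(1));
            }
        }
    }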
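
To try the matching logic without fetching the page, here is a minimal sketch that applies the same regular expression to a raw class attribute string. The class name StarRating and the method extractStars are hypothetical names made up for this illustration, and the code uses only the standard library. With jsoup you could pass it the value of span.className().

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup.
class StarRating {
    // Matches a single class token such as "stars5" and captures the digit.
    private static final Pattern STARS = Pattern.compile("stars(\\d)");

    // classAttr is the raw class attribute value, e.g. "rating_sprite stars5".
    static OptionalInt extractStars(String classAttr) {
        // Class names are separated by whitespace, so test each token on its own.
        for (String token : classAttr.trim().split("\\s+")) {
            Matcher m = STARS.matcher(token);
            if (m.matches()) {
                return OptionalInt.of(Integer.parseInt(m.group(1)));
            }
        }
        // No token of the form starsN was found.
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractStars("rating_sprite stars5"));           // OptionalInt[5]
        System.out.println(extractStars("rating_sprite analyst-rating-5")); // OptionalInt.empty
    }
}
```

Returning an OptionalInt rather than printing makes the "no rating found" case explicit, which is useful because the second span in the snippet shares the rating_sprite class but has no stars token.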

