
Extracting data from HTML page using Jsoup

I'm trying to get my level in each skill from https://secure.runescape.com/m=hiscore_oldschool/hiscorepersonal.ws?user1=Feed%20Meh%20Dog

It's a table, but I don't see a table id anywhere. I just need to know what id or class I should be using.

I've tried multiple tutorials, but they all use a table with a straightforward class or id. There is a div id which I think I should use; I'm just not sure how to extract each specific row/skill.

    final Document document = Jsoup.connect("https://secure.runescape.com/m=hiscore_oldschool/hiscorepersonal.ws?user1=Feed%20Meh%20Dog").get();

    for (Element row : document.select("WHAT DO I PUT HERE tr")) {
        final String attack = row.select("WHAT DO I PUT HERE").text();
        final String defence = row.select("WHAT DO I PUT HERE").text();
        final String strength = row.select("WHAT DO I PUT HERE").text();
    }

I just want to output the rows, or individual skills, to the console. Any help would be much appreciated.

I would recommend using the official API (https://secure.runescape.com/m=hiscore_oldschool/index_lite.ws?player=) if you want to get the data you're looking for easily. Doing that with Jsoup in a hacky kind of way would look a little like this:

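    import java.util.ArrayList;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    // Fetch the "lite" hiscores response; Jsoup places the returned text inside <body>.
    final Document document = Jsoup.connect("https://secure.runescape.com/m=hiscore_oldschool/index_lite.ws?player=Feed%20Meh%20Dog").get();
    final Element body = document.selectFirst("body");

    // Split on any whitespace: each entry is one comma-separated line of the response.
    String[] rawSkills = body.html().split("\\s+");
    ArrayList<String[]> skills = new ArrayList<>();

    for (String s : rawSkills) {
        skills.add(s.split(","));
    }

    // Print field 1 (the level) of the first entry.
    System.out.println(skills.get(0)[1]);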

Then, to select an individual skill, you would do something like skills.get(x)[y], where x is the index of the skill in the list (starting at 0) and y is the piece of information you want about that skill: 0 is rank, 1 is skill level and 2 is xp.
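As a rough sketch of that indexing, the snippet below parses the same "rank,level,xp" layout into named fields, so later code can read `entry.level` instead of `[1]`. It uses only the standard library and runs against a hard-coded sample string rather than the live endpoint. The class name `HiscoreParser`, the `SkillEntry` type, and the sample numbers are all made up for illustration. Skipping lines that do not have exactly three fields is also an assumption about how you might want to handle other kinds of lines.

```java
import java.util.ArrayList;
import java.util.List;

public class HiscoreParser {

    /** One response line; the fields match indices 0 (rank), 1 (level), 2 (xp). */
    public static final class SkillEntry {
        public final long rank;
        public final int level;
        public final long xp;

        SkillEntry(long rank, int level, long xp) {
            this.rank = rank;
            this.level = level;
            this.xp = xp;
        }
    }

    /** Parses whitespace-separated "rank,level,xp" lines into entries. */
    public static List<SkillEntry> parse(String body) {
        List<SkillEntry> entries = new ArrayList<>();
        for (String line : body.trim().split("\\s+")) {
            String[] parts = line.split(",");
            if (parts.length != 3) {
                continue; // assumption: ignore lines that are not rank,level,xp
            }
            entries.add(new SkillEntry(
                    Long.parseLong(parts[0]),
                    Integer.parseInt(parts[1]),
                    Long.parseLong(parts[2])));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Invented sample data in the same shape as the response, not real hiscores.
        String sample = "100,1500,2000000\n200,75,1300000";
        List<SkillEntry> entries = parse(sample);
        // Same as skills.get(1)[1] in the snippet above.
        System.out.println(entries.get(1).level);
    }
}
```

With the Jsoup snippet above, you could pass `body.html()` to `parse` in place of the sample string.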

The API doesn't provide the name of each skill, so you would have to add those manually. The skills are in the same order as on the high scores page here.
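One way to attach the names by hand is a parallel array in the same order as the response, sketched below. The `SkillNames` class and `levelsByName` helper are hypothetical. Only the first four names are listed, and that order (Overall, Attack, Defence, Strength) is an assumption you should check against the high scores page before extending the array. The input lines are invented sample data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SkillNames {

    // Assumed order of the first entries; verify against the high scores page.
    static final String[] NAMES = {"Overall", "Attack", "Defence", "Strength"};

    /** Maps each known name to the level (field 1) of the line at the same index. */
    public static Map<String, Integer> levelsByName(String[] lines) {
        Map<String, Integer> levels = new LinkedHashMap<>();
        int count = Math.min(NAMES.length, lines.length);
        for (int i = 0; i < count; i++) {
            String[] parts = lines[i].split(",");
            levels.put(NAMES[i], Integer.parseInt(parts[1]));
        }
        return levels;
    }

    public static void main(String[] args) {
        // Invented sample lines in "rank,level,xp" form, not real hiscores.
        String[] lines = {"100,1500,2000000", "200,75,1300000", "300,70,800000"};
        for (Map.Entry<String, Integer> e : levelsByName(lines).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

A `LinkedHashMap` is used so that iteration follows the order of the response.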

EDIT: I've taken the liberty of creating a small Java wrapper for this specific endpoint, which you can find here.
