
Getting the content of a table hidden behind an onclick JavaScript button using JSoup

I am building a personal web scraper for a game. This is the website I want to scrape: http://forum.toribash.com/clan_war.php?clanid=139

I want to count how often each name appears under "show details".

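I have read "Get content from javascript onClick hyperlink", but I'm not sure whether it covers what I'm actually looking for. I suspect it doesn't, and I haven't tried that question's answer yet because I don't know how to adapt https://stackoverflow.com/a/12268561/10467473 to my case.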

        // Read the month to filter on from standard input
        BufferedReader month = new BufferedReader(new InputStreamReader(System.in));
        String mth = month.readLine();
        // Fetch the clan war overview page
        Document docs = Jsoup.connect("http://forum.toribash.com/clan_war.php?clanid=139").get();

        // Collect every war history entry
        Elements collection = docs.getElementsByClass("war_history_entry");
        // Iterate over every entry
        for(Element e : collection){
            // Only use entries whose month (second word of war_info) matches the requested one
            if(e.getElementsByClass("war_info").text().split(" ")[1].equalsIgnoreCase(mth)){
                // This is supposed to hold every element with class "player" behind the onclick button,
                // but it doesn't work: nothing is found
                Elements cek = e.getElementsByClass("player");
                for(Element c : cek){
                    System.out.println(c.text());
                }
            }
        }

For now, I expect to at least get the names from the "show details" table:

Kaito
Chax
Draku

and so on.

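This page doesn't contain the information you want to scrape. The results are loaded by AJAX (JavaScript) after the button is clicked. You can open your web browser's debugger and watch the Network tab to see what happens when you click a button. Clicking a button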

<button id="buttonwarid19557"  ... >

loads the table from this URL:

http://forum.toribash.com/clan_war_ajax.php?warid=19557&clanid=139

Note that the button id and the URL share the same ID number (19557).
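The mapping from a button id to its AJAX URL can be sketched in isolation, without any network access. This is only an illustration: the class name `WarUrlBuilder` and its methods are hypothetical, and it assumes every button id has the form `buttonwarid<digits>`, as in the single example above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WarUrlBuilder {
    // Assumption: button ids always look like "buttonwarid<digits>"
    private static final Pattern BUTTON_ID = Pattern.compile("buttonwarid(\\d+)");

    /** Returns the numeric war id from a button id, or null if the id has another shape. */
    public static String extractWarId(String buttonId) {
        Matcher m = BUTTON_ID.matcher(buttonId);
        return m.matches() ? m.group(1) : null;
    }

    /** Builds the URL that serves the details table for one war of one clan. */
    public static String detailsUrl(String warId, String clanId) {
        return "http://forum.toribash.com/clan_war_ajax.php?warid=" + warId + "&clanid=" + clanId;
    }

    public static void main(String[] args) {
        String warId = extractWarId("buttonwarid19557");
        // prints http://forum.toribash.com/clan_war_ajax.php?warid=19557&clanid=139
        System.out.println(detailsUrl(warId, "139"));
    }
}
```

Matching the whole id against a pattern, rather than stripping the prefix, makes it easy to skip buttons whose id has an unexpected shape.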

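What you have to do is get the ID from each button, fetch a separate document for each one, and parse them one by one. That is what your web browser does anyway.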

        // Read the month to filter on from standard input
        BufferedReader month = new BufferedReader(new InputStreamReader(System.in));
        String mth = month.readLine();
        // Fetch the clan war overview page
        Document docs = Jsoup.connect("http://forum.toribash.com/clan_war.php?clanid=139").get();

        // Collect every war history entry
        Elements collection = docs.getElementsByClass("war_history_entry");
        // Iterate over every entry
        for(Element e : collection){
            // Only use entries whose month (second word of war_info) matches the requested one
            if(e.getElementsByClass("war_info").text().split(" ")[1].equalsIgnoreCase(mth)){
                // Select the button of this entry
                Element button = e.selectFirst("button");
                // selectFirst returns null when nothing matches, so skip entries without a button
                if(button == null) continue;
                // Get the war id from the button id
                String buttonId = button.attr("id");
                // Remove the text prefix because we need only the number
                String warId = buttonId.replace("buttonwarid", "");

                System.out.println("downloading results for " + e.getElementsByClass("war_info").text());
                // Download and parse the subpage containing the table with info about a single war.
                // The referrer makes the request look more like it comes from a real web browser,
                // to avoid possible hotlinking protection.
                Document table = Jsoup.connect("http://forum.toribash.com/clan_war_ajax.php?warid=" + warId + "&clanid=139").referrer("http://forum.toribash.com/clan_war.php?clanid=139").get();
                // Get every <td class="player"> ... </td>
                Elements players = table.select(".player");
                for(Element player : players){
                    System.out.println(player.text());
                }
            }
        }
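The code above only prints the names, while the question asks for how often each name appears. One possible way to tally them is sketched below. It is not part of the original answer: the class `NameFrequency` and its `count` method are hypothetical, and the sample list stands in for the `player.text()` values collected in the loop above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NameFrequency {
    /** Counts how often each name occurs, keeping the order in which names were first seen. */
    public static Map<String, Integer> count(List<String> names) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : names) {
            String key = name.trim();
            // Ignore blank cells
            if (key.isEmpty()) continue;
            // Insert 1 for a new name, otherwise add 1 to the existing count
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; in the scraper, add each player.text() to a list instead
        List<String> names = List.of("Kaito", "Chax", "Draku", "Kaito");
        count(names).forEach((name, n) -> System.out.println(name + ": " + n));
    }
}
```

Names are compared case-sensitively here, so "Kaito" and "kaito" would be counted separately; whether that is the desired behavior depends on how the site renders player names.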
