
I need to extract data from a dynamic table on a website, I want to use Jsoup (Java)

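I'm using Jsoup to extract data from a table on a website. The table's content is dynamic, and there is a refresh button that updates the rows when I click it.

I tried to extract the data with Jsoup, but when I parse the HTML page (with the parse method) the table rows are not there. Clicking the refresh button calls a JavaScript function.

Do you have any suggestions? I read that Jsoup cannot extract dynamic values from an HTML page. Is that true? Do I have to use a different library?

This is what I see in the "Elements" panel of Chrome DevTools: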

[Image: data to extract]

I want to extract the data highlighted in yellow.

This is what I see in the "Sources" panel of Chrome DevTools:

<div id="datatable_wrapper" class="dataTables_wrapper form-inline dt-bootstrap no-footer" style="display:none;margin-bottom: 50px;min-height:585px">
  <div class="row" style="margin:0px:padding:0px">
    <div class="col-sm-12">
      <table id="datatable" class="table table-striped table-bordered table-hover dataTable no-footer" style="width: 1100px;margin:0px !important;padding:0px !important;" role="grid" aria-describedby="datatable_info">
        <thead>
          <tr role="row">
            <th class="sorting" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" aria-label="Event Date/Time: activate to sort column ascending" style="width: 100px;">TITLE_COLUMN1</th>
            <th class="hCenter sorting_disabled" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" style="width: 40px;"></th>
            <th class="sorting" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" aria-label="Event Name: activate to sort column ascending" style="width: 250px;">TITLE_COLUMN2</th>
            <th class="sorting" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" aria-label="Bet: activate to sort column ascending" style="width: 100px;">TITLE_COLUMN3</th>
            <th class="hCenter sorting" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" aria-label="Rating: activate to sort column" style="width: 80px;">TITLE_COLUMN4(%)</th>
            <th class="hCenter sorting" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" aria-label="SNR Rating: activate to sort column" style="width: 95px;">TITLE_COLUMN5(%)</th>
            <th class="hCenter sorting" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" aria-label="Bookie: activate to sort column ascending" style="width: 100px;">TITLE_COLUMN6</th>
            <th class="hCenter sorting" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" aria-label="Back Odds: activate to sort column ascending" style="width:40px;">TITLE_COLUMN7</th>
            <th class="hCenter" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" aria-label="Exchange: activate to sort column ascending" style="width: 100px;">TITLE_COLUMN8</th>
            <th class="hCenter sorting" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" aria-label="Lay Odds: activate to sort column ascending" style="width: 40px;">TITLE_COLUMN9</th>
            <th class="sorting" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" aria-label="Availibity: activate to sort column ascending" style="width: 50px;">TITLE_COLUMN10</th>
            <th class="sorting" tabindex="0" aria-controls="datatable" rowspan="1" colspan="1" aria-label="Availibity: activate to sort column ascending" style="width: 50px;">
              <img src="/clock.png" width="15px" style="margin-left:10px;" />
            </th>
          </tr>
        </thead>
        <tbody>
        </tbody>
      </table>
    </div>
  </div>

This is the JavaScript function that populates the table:

function getData(ratingFrom,ratingTo,oddsFrom,oddsTo,availability,sortColumn=5,sortDirection="desc",offset=0,bookies="",filterbookies="c81e728d9d4c2f636f067f89cc14862c",eventname="",dateFrom="",dateTo="",exchange="",exchanges="",sport="all"){

$("#datatable_processing").css("display","block");
$("#datatable_nodata").css("display","none");
var importo_puntata = parseFloat($("#settings-form input[name=importo-puntata]").val());
var importo_bonus_rimborso = parseFloat($("#settings-form input[name=importo-bonus-rimborso]").val());

$.post("/get_data.php", {"refund":importo_bonus_rimborso,"back_stake":importo_puntata,"name":eventname,"filterbookies":filterbookies,"bookies":bookies,"rating-from":ratingFrom,"rating-to":ratingTo,"odds-from":oddsFrom,"odds-to":oddsTo,"min-liquidity":availability,"sort-column":sortColumn,"sort-direction":sortDirection,"offset":offset,"date-from":dateFrom,"date-to":dateTo,"exchange":exchange,"exchanges":exchanges,'sport':sport}, function(data){
    var allData = jQuery.parseJSON(data);
    var paginationHtml = "<li class=\"paginate_button previous disabled\" aria-controls=\"datatable\" tabindex=\"0\" id=\"datatable_previous\"><a href=\"#\">Precedente</a></li>";
    paginationHtml+= "<li class=\"paginate_button next\" aria-controls=\"datatable\" tabindex=\"0\" id=\"datatable_next\"><a href=\"#\">Seguente</a></li>";
    $("ul.pagination").html(paginationHtml);

    if(allData.data.length>0)
    {    
        if(allData.bookmakers.length>0)
        {
            if(allData.bookmakers.length>1)
            {
                var html =  '<option value="all">Tutti i Bookmakers</option>';
                $.each(allData.bookmakers, function(i, item) {
                    if(filterbookies == item.id.toString())
                        html  += '<option value="'+item.id.toString()+'" selected="selected">'+capitalizeFirstLetter(item.name)+'</option>';
                    else    
                        html  += '<option value="'+item.id.toString()+'">'+capitalizeFirstLetter(item.name)+'</option>';
                });

                $("#bookmaker").css("display","inline-block");
                $("#bookmaker").html(html);
            }
            else if(allData.bookmakers.length==1)
            {
                var html = '<option value="'+allData.bookmakers[0].id.toString()+'" selected="selected">'+capitalizeFirstLetter(allData.bookmakers[0].name)+'</option>';
                $("#bookmaker").html(html);
            }
        }

        if(allData.exchanges.length>1 && allData.bookmakers.length >1)
        {
                var html =  '<option value="all">Tutti gli Exchanges</option>';
                $.each(allData.exchanges, function(i, item) {
                        html  += '<option value="'+item.toString()+'">'+capitalizeFirstLetter(item.toString())+'</option>';
                });

                $("#exchange").css("display","inline-block");
                $("#exchange").html(html);
        }
        else if ($("input[name=exchanges]").val() == "all")
        {
                $("#exchange").css("display","inline-block");
        }


        $("#datatable tbody").html("");
        var html = "";



        $.each(allData.data, function(i, item) {
            var json = allData.data[i];
            var redRating = "";
            if(json.rating>=100)
                redRating= " redrating ";

            html  +=  "<tr role=\"row\" style=\"background-color: #fff !important;\" back-odds=\""+json.back_odds+"\" lay-odds=\""+json.lay_odds+"\" competition=\""+json.competition+"\" country=\""+json.country_code+"\" exchange=\""+json.exchange+"\" >"+
                       "<td>"+json.opendate+"</td>"+
                       "<td class=\"hCenter\"><img src=\"/images/"+json.sport+".png\" /></td>"+
                       "<td>"+json.event_name+"</td>"+
                       "<td>"+json.bet+"</td>"+
                       "<td class=\"hCenter sorting_1\"><span class=\"rating"+redRating+"\">"+json.rating+"</span></td>"+
                       "<td class=\"hCenter sorting_1\"><span class=\"snrrating\">"+json.snr_rating.toString()+".00"+"</span>"+
                       "<img src=\"/images/calculator.png\" class=\"calculator\""+
                       "  attrib-sport=\""+json.sport+"\" attrib-exchange=\""+json.exchange+"\" attrib-competition=\""+json.competition+"\" attrib-country=\""+json.country_code+"\" attrib-eventdate=\""+json.opendate+"\" attrib-eventname=\""+json.event_name+"\" attrib-bet=\""+json.bet+"\" attrib-market=\""+json.market_type+"\" attrib-rating=\""+json.rating+"\" attrib-odds-provider_id=\""+json.odds_provider_fk+"\"  attrib-odds-provider=\""+capitalizeFirstLetter(json.odds_provider)+"\" attrib-back-odds=\""+json.back_odds+"\" attrib-lay-odds=\""+json.lay_odds+"\" attrib-availability=\""+json.availability+"\" attrib-bookie-bet-url=\""+json.bookie_bet_url+"\" attrib-betfair-bet-url=\""+json.betfair_bet_url+"\" "+
                       " /></td>"+
                       "<td class=\" hCenter\"><img src=\"/images/"+json.odds_provider_fk+".png\" width=\"80\" /></td>"+
                       "<td class=\" hCenter back\" ><span>"+json.back_odds+"</span></td>"+
                       "<td class=\" hCenter\" ><img src=\"/images/"+json.exchange+".png\" width=\"80\" /></td>"+
                       "<td class=\" hCenter lay\" ><span>"+json.lay_odds+"</span></td>"+
                       "<td>&nbsp; &#8364;"+json.availability+"</td>"+
                       "<td>"+json.update_time.toString()+"</td>"+
                    "</tr>";
        });



        var allEvents = parseInt(allData.allEventsCount);
        if(allEvents>10)
        {    
            allEvents = allEvents - 10;
            var j=0;
            var pageStart = parseInt(allData.offset);
            if(pageStart<9)
                pageStart=1;
                else
                pageStart-=4;

            var pageHtml = "";
            for(var i=pageStart;i<=parseInt(allEvents/10);i++)
            {
                var current = "";
                if((i-1==parseInt(allData.offset) && pageStart!=1) || (pageStart==1 && i-1==allData.offset))
                    current = "paginate_current disabled";

                pageHtml += "<li class=\"paginate_button paginate "+current+"\" aria-controls=\"datatable\" tabindex=\"0\"><a href=\"\" class=\"paginate\">"+i.toString()+"</a></li>";

                j++;
                if(j>9)
                    break;
            }

            $("#datatable_previous").after(pageHtml);
        }

        $("#datatable_previous").click(function(event){
            if($(this).hasClass("disabled"))
                {event.preventDefault();return;}
            var sortColumn = $("#search-form input[name=sort-column]").val();
            var sortDirection = $("#search-form input[name=sort-direction]").val();

            var ratingFrom = $("#search-form input[name=rating-from]").val();
            var ratingTo=$("#search-form input[name=rating-to]").val();
            var oddsFrom=$("#search-form input[name=odds-from]").val();
            var oddsTo=$("#search-form input[name=odds-to]").val();
            var availability=$("#search-form input[name=availability]").val();
            var offset=parseInt($("#search-form input[name=offset]").val())-1;
            var bookies = $("#search-form input[name=bookies]").val();
            var filterbookies = $("#bookmaker").val();
             var dateFrom = $("#date-from").val();
                    var dateTo = $("#date-to").val();
            var teamname = $("#event-name").val();
             var exchange = $("#exchange").val();
             var exchanges = $("#search-form input[name=exchanges]").val();
            var sport = $("#sport").val();
                 getData(ratingFrom,ratingTo,oddsFrom,oddsTo,availability,sortColumn,sortDirection,offset,bookies,filterbookies,teamname,dateFrom,dateTo,exchange,exchanges,sport);

        });

        $("#datatable_next").click(function(event){
            if($(this).hasClass("disabled"))
                {event.preventDefault();return;}

            var sortColumn = $("#search-form input[name=sort-column]").val();
            var sortDirection = $("#search-form input[name=sort-direction]").val();

            var ratingFrom = $("#search-form input[name=rating-from]").val();
            var ratingTo=$("#search-form input[name=rating-to]").val();
            var oddsFrom=$("#search-form input[name=odds-from]").val();
            var oddsTo=$("#search-form input[name=odds-to]").val();
            var availability=$("#search-form input[name=availability]").val();
            var offset=parseInt($("#search-form input[name=offset]").val())+1;
            var bookies = $("#search-form input[name=bookies]").val();
            var filterbookies = $("#bookmaker").val();
            var dateFrom = $("#date-from").val();
                    var dateTo = $("#date-to").val();
            var teamname = $("#event-name").val();
             var exchange = $("#exchange").val();
             var exchanges = $("#search-form input[name=exchanges]").val();
            var sport = $("#sport").val();
                 getData(ratingFrom,ratingTo,oddsFrom,oddsTo,availability,sortColumn,sortDirection,offset,bookies,filterbookies,teamname,dateFrom,dateTo,exchange,exchanges,sport);

        });

        $("ul.pagination>li.paginate>a.paginate").click(function(event){
            event.preventDefault();
            var offset = parseInt($(this).html())-1;
            var sortColumn = $("#search-form input[name=sort-column]").val();
            var sortDirection = $("#search-form input[name=sort-direction]").val();

            var ratingFrom = $("#search-form input[name=rating-from]").val();
            var ratingTo=$("#search-form input[name=rating-to]").val();
            var oddsFrom=$("#search-form input[name=odds-from]").val();
            var oddsTo=$("#search-form input[name=odds-to]").val();
            var availability=$("#search-form input[name=availability]").val();
            var bookies = $("#search-form input[name=bookies]").val();
            var filterbookies = $("#bookmaker").val();
            var teamname = $("#event-name").val();
             var dateFrom = $("#date-from").val();
                    var dateTo = $("#date-to").val();
             var exchange = $("#exchange").val();
             var exchanges = $("#search-form input[name=exchanges]").val();
            var sport = $("#sport").val();
                 getData(ratingFrom,ratingTo,oddsFrom,oddsTo,availability,sortColumn,sortDirection,offset,bookies,filterbookies,teamname,dateFrom,dateTo,exchange,exchanges,sport);
        });

        $("#datatable_wrapper .pageNumber").html((parseInt(allData.offset)*10)+1);
        $("#datatable_wrapper .pageCount").html(allData.data.length+(parseInt(allData.offset)*10));
        $("#datatable_wrapper .allEventsCount").html(allData.allEventsCount);

        if((allData.data.length+(parseInt(allData.offset)*10)) < allData.allEventsCount)
            $("#datatable_next").removeClass("disabled");
        else
        {
            $("#datatable_next").removeClass("disabled");
            $("#datatable_next").addClass("disabled");
        }

        if(parseInt(allData.offset)>=1)
            $("#datatable_previous").removeClass("disabled");
        else
        {
            $("#datatable_previous").removeClass("disabled");
            $("#datatable_previous").addClass("disabled");
        }

        $("#datatable tbody").html(html);
        $("#datatable_wrapper").css("display","block");
        $("#datatable th").removeClass("sorting_desc");
        $("#datatable th").removeClass("sorting_asc");
        $("#datatable th").removeClass("sorting");
        $("#datatable th").addClass("sorting");
        $("#datatable th").removeAttr("aria-sort");

        $("#search-form input[name=offset]").val(allData.offset);
        $("#search-form input[name=sort-column]").val(allData.sortColumn);
        $("#search-form input[name=sort-direction]").val(allData.sortDirection);

        $($("#datatable th")[allData.sortColumn]).attr('aria-sort',allData.sortDirectionFull);
        $($("#datatable th")[allData.sortColumn]).removeClass("sorting");
        $($("#datatable th")[allData.sortColumn]).addClass("sorting_"+allData.sortDirection);    
        $("#datatable img.calculator").click(function(){

                $("#event-details #calc-event-datetime").val($(this).attr('attrib-eventdate'));
                $("#event-details #calc-event-name").val($(this).attr('attrib-eventname'));

                $("#event-details .event-rating").html($(this).attr('attrib-rating'));
                $("#event-details .event-competition").html($(this).attr('attrib-competition'));
                $("#event-details .event-country").html($(this).attr('attrib-country'));

                $("#right-container input[name=back-odds]").val($(this).attr('attrib-back-odds'));    
                $("#right-container a.bookie-bet-url").attr("href",$(this).attr('attrib-bookie-bet-url'));    

                $("#right-container a.betfair-bet-url").attr("href",$(this).attr('attrib-betfair-bet-url'));    
                if ($(this).attr('attrib-sport') == "tennis" && $(this).attr('attrib-exchange') == "betfair")
                    $("#right-container a.betfair-bet-url").attr("href",$(this).attr('attrib-betfair-bet-url').replace("football","tennis"));    
                $("#right-container input[name=back-commission]").val("0.00");
                $("#right-container input[name=lay-odds]").val($(this).attr('attrib-lay-odds'));    
                $("#right-container input[name=lay-commission]").val("0.05");
                $("#odds-container .event-outcome").html($(this).attr('attrib-bet')+" To Win");
                $("#odds-container span.backTitle").html($(this).attr('attrib-back-odds'));
                $("#odds-container span.layTitle").html($(this).attr('attrib-lay-odds'));
                $("#match-container img.exchangeLogo").attr("src","/images/"+$(this).attr('attrib-exchange')+".png");
                $("#match-container img.sport-img").attr("src","/images/"+$(this).attr('attrib-sport')+".png");
                $("span.event-exchange").html(capitalizeFirstLetter($(this).attr('attrib-exchange')));
                $("span.event-back-oddsprovider").html($(this).attr('attrib-odds-provider'));            
                $("#odds-container span.back-odds").html($(this).attr('attrib-back-odds'));
                $("#odds-container span.lay-odds").html($(this).attr('attrib-lay-odds'));

                $("#odds-container span.backTitle").html($(this).attr('attrib-back-odds'));
                $("#odds-container span.backTitle").html($(this).attr('attrib-back-odds'));

                $("#odds-container img.bookmakerLogo").attr("src","/images/"+$(this).attr('attrib-odds-provider_id').toString()+".png");

                $("#odds-container .event-back-outcome").html($(this).attr('attrib-bet'));
                $("#odds-container .event-lay-outcome").html($(this).attr('attrib-bet'));
                $("#odds-container span.lay-availability").html("&#8364;"+$(this).attr('attrib-availability').toString()+" liquidità");

                var importo_puntata = $("#settings-form input[name=importo-puntata]").val();
                var importo_bonus_rimborso = $("#settings-form input[name=importo-bonus-rimborso]").val();
                $("#right-container input[name=back-stake]").val(importo_puntata);
                $("#right-container input[name=back-refund-stake]").val(importo_bonus_rimborso);

                $("#match-container").dialog({
                    width: 850,
                    modal: true,
                    resizable: false,
                    title: "Calcolatore - "+$(this).attr('attrib-eventname'),
                    open: function(event, ui) {

                        $("#rbtnSNR").attr("checked",false);
                        $("#rbtnSR").attr("checked",false);
                        $("#rbtnNormal").attr("checked",true);
                        updateCalculator(0);
                        $("html, body").animate({ scrollTop: 100 }, "fast");
                    }
                }).parent().position({
                    my: 'top+50px',
                    at: 'top',
                    collision: "flip flip",
                    of: $("#datatable_wrapper")
                });
            });
    }
    else
    {
        if(allData.bookmakers.length==1)
        {
            $("#bookmaker").css("display","none");
        }

        $("#datatable_nodata").css("display","block");
        $("#datatable_nodata h3").html("<center>Nessun dato trovato</center>");
        $("#datatable_wrapper").css("display","none");
    }

    $("#datatable_processing").css("display","none");
});}

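I read that Jsoup cannot extract dynamic values from an HTML page. Is that true?

True. The data you are looking for is not in the page source. It is loaded dynamically as the result of a POST to "/get_data.php". Try fetching that response yourself, since it contains a JSON object. I suggest using a JSON parsing library to read it.

Jsoup is not required here, but it can be used to fetch the JSON data easily: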

// Needs org.jsoup.Jsoup and org.jsoup.Connection. $.post sends form fields, so use data(), not a raw body.
// url and every value variable below are Strings you define yourself (take them from the page's form).
String jsonResponse = Jsoup
    .connect(url + "/get_data.php")
    .method(Connection.Method.POST)
    .header("Accept", "application/json")
    .timeout(20000)
    .ignoreContentType(true)
    .maxBodySize(0)
    .data("refund", refund, "back_stake", backStake, "name", eventName)
    .data("filterbookies", filterBookies, "bookies", bookies)
    .data("rating-from", ratingFrom, "rating-to", ratingTo)
    .data("odds-from", oddsFrom, "odds-to", oddsTo, "min-liquidity", availability)
    .data("sort-column", sortColumn, "sort-direction", sortDirection, "offset", offset)
    .data("date-from", dateFrom, "date-to", dateTo)
    .data("exchange", exchange, "exchanges", exchanges, "sport", sport)
    .execute().body();
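
Since Jsoup is not strictly needed for this request, here is a rough sketch of the same POST using only the JDK's built-in java.net.http client (Java 11 or later). It assumes the endpoint accepts the same form-encoded fields that the page's getData function passes to $.post. The class name GetDataSketch, the helper names, the baseUrl parameter, and the numeric filter values are all hypothetical placeholders; only the field names and the defaults for sort-column, sort-direction, offset, filterbookies and sport come from the JavaScript above. The site may also require a session cookie or login, which this sketch does not handle.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class GetDataSketch {

    // Encodes key/value pairs as application/x-www-form-urlencoded,
    // the format jQuery's $.post uses when given a plain object.
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Sends the form to <baseUrl>/get_data.php and returns the raw response body,
    // which the page's script treats as JSON text.
    static String post(String baseUrl, Map<String, String> fields)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/get_data.php"))
                .timeout(Duration.ofSeconds(20))
                .header("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8")
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    // Field names mirror the object passed to $.post in getData.
    // Values marked "placeholder" are assumptions, not values taken from the site.
    static Map<String, String> exampleFields() {
        Map<String, String> f = new LinkedHashMap<>();
        f.put("refund", "0");            // placeholder
        f.put("back_stake", "10");       // placeholder
        f.put("name", "");
        f.put("filterbookies", "c81e728d9d4c2f636f067f89cc14862c"); // default in getData
        f.put("bookies", "");
        f.put("rating-from", "0");       // placeholder
        f.put("rating-to", "100");       // placeholder
        f.put("odds-from", "1");         // placeholder
        f.put("odds-to", "10");          // placeholder
        f.put("min-liquidity", "0");     // placeholder
        f.put("sort-column", "5");       // default in getData
        f.put("sort-direction", "desc"); // default in getData
        f.put("offset", "0");            // default in getData (page index)
        f.put("date-from", "");
        f.put("date-to", "");
        f.put("exchange", "");
        f.put("exchanges", "");
        f.put("sport", "all");           // default in getData
        return f;
    }

    public static void main(String[] args) {
        // Prints the request body only; call post(...) with a real base URL to send it.
        System.out.println(encodeForm(exampleFields()));
    }
}
```

To page through the results, the offset field could be increased by one per request, which is what the page's "next" button handler does before calling getData again. The returned string would then go to whichever JSON library you prefer; judging by the script, the table rows are in the "data" array of the response.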
