
Jsoup: parsing an HTML table loaded via AJAX

I have a problem parsing a table that is loaded via AJAX:

Document doc = Jsoup.connect("http://lfl.ru/club553").get();

This is what I get:

<div class="tournament_stats_table_tournament_3442 tournament-stats-table" style="display: block;" url="/?ajax=1&amp;method=tournament_stats_table&amp;tournament_id=3442&amp;club_id=553">
                        подождите...                    </div>

Please tell me what can be done in this situation. Thanks.

You won't be able to get this data with a plain server-to-server request. Jsoup only downloads the HTML; the page's JavaScript is never executed, so the table that the script would load is simply not in the response.

As alternatives, consider these:

  1. If you own the website you are parsing, avoid AJAX for this table if possible;
  2. Find the endpoint that the AJAX request calls and parse its response directly instead of the web page.
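
For the second option, the placeholder div shown in the question already carries the endpoint in its url attribute. Below is a minimal sketch of turning that attribute into a full address. It assumes the attribute holds a path relative to the site and that its value has already been read with the HTML entities decoded (so "&amp;" has become "&"). The class and method names are hypothetical, and only the standard library is used:

```java
import java.net.URI;

class AjaxUrlResolver {
    // Resolves the relative path from the placeholder div's "url" attribute
    // against the address of the page it was found on.
    // Assumption: urlAttribute is already HTML-unescaped.
    public static String resolveAjaxUrl(String pageUrl, String urlAttribute) {
        return URI.create(pageUrl).resolve(urlAttribute).toString();
    }

    public static void main(String[] args) {
        String pageUrl = "http://lfl.ru/club553";
        String urlAttribute =
                "/?ajax=1&method=tournament_stats_table&tournament_id=3442&club_id=553";
        // Prints the absolute address of the AJAX endpoint.
        System.out.println(resolveAjaxUrl(pageUrl, urlAttribute));
    }
}
```

The resulting address can then be passed to Jsoup.connect(...).get() in the same way as the page URL in the question. Whether the endpoint returns an HTML fragment or JSON, and whether it expects extra headers or cookies, would have to be checked in the browser's network tab.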

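First, you need to get the cookies by requesting the page:

 Connection.Response response = Jsoup.connect(url).method(Connection.Method.GET).execute();

Then make the actual request with those cookies and the required headers:

 Document document = Jsoup.connect(url).cookies(response.cookies()).header(name, value).data(key, value).post();

For example: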

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Step 1: fetch the page once to obtain the session cookies.
Connection.Response loginForm = Jsoup.connect("http://www.a5.cn")
        .method(Connection.Method.GET).execute();

// Step 2: post the form data with those cookies and AJAX-style headers.
Document document = Jsoup.connect("http://www.a5.cn/login.html")
        .data("formhash", "97bfbf").data("hdn_refer", "http://www.a5.cn/")
        .data("account", "userID").data("autoLogin", "1").data("password", "your password")
        .cookies(loginForm.cookies())
        .header("Accept", "application/json, text/javascript, */*; q=0.01").header("X-Requested-With", "XMLHttpRequest")
        .post();

System.out.println(document.body().text());


 