
Jsoup: parsing an HTML table that is loaded via Ajax

I have a problem parsing a table that is loaded via Ajax:

Document doc = Jsoup.connect("http://lfl.ru/club553").get();

This is what I get:

<div class="tournament_stats_table_tournament_3442 tournament-stats-table" style="display: block;" url="/?ajax=1&amp;method=tournament_stats_table&amp;tournament_id=3442&amp;club_id=553">
                        подождите...                    </div>

What can be done in this situation? Thanks.

You won't be able to get this data with a plain server-to-server request. Jsoup only fetches and parses the HTML; it does not execute JavaScript, so the Ajax call that fills in the table never runs and the table is simply not in the response.

As alternatives, consider the following:

  1. If you own the website you are parsing, avoid loading the table via Ajax if possible;
  2. Find the endpoint that the Ajax request calls and parse its response instead of the web page.
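For the page in the question, the placeholder div appears to carry the Ajax endpoint in its `url` attribute, so option 2 could look roughly like the sketch below. The class name `AjaxEndpointExample` and the helper `endpointFor` are made up for illustration, and it is only an assumption that the endpoint returns an HTML fragment containing table rows; check the real response in the browser's network tab first.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class AjaxEndpointExample {

    // Hypothetical helper: finds the placeholder <div> and returns the absolute
    // URL built from its "url" attribute, or null if no such element exists.
    static String endpointFor(String html, String baseUri) {
        Document page = Jsoup.parse(html, baseUri);
        Element placeholder = page.selectFirst("div.tournament-stats-table[url]");
        return placeholder == null ? null : placeholder.absUrl("url");
    }

    public static void main(String[] args) throws Exception {
        String pageUrl = "http://lfl.ru/club553";
        String html = Jsoup.connect(pageUrl).get().outerHtml();

        String endpoint = endpointFor(html, pageUrl);
        if (endpoint == null) {
            System.out.println("No Ajax placeholder found");
            return;
        }

        // Assumption: the endpoint answers with an HTML fragment. If it returns
        // JSON instead, read body and hand it to a JSON parser.
        String body = Jsoup.connect(endpoint)
                .header("X-Requested-With", "XMLHttpRequest")
                .ignoreContentType(true)
                .execute()
                .body();

        Document fragment = Jsoup.parse(body, endpoint);
        for (Element row : fragment.select("tr")) {
            System.out.println(row.text());
        }
    }
}
```

The extraction step is kept in its own method so it can be checked against the HTML snippet from the question without touching the network.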

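First, get the cookies by executing a plain GET request:

 Connection.Response response = Jsoup.connect(url).method(Connection.Method.GET).execute();

Then make the actual request, sending those cookies and the required headers:

 Document document = Jsoup.connect(ajaxUrl).cookies(response.cookies()).header(name, value).data(key, value).post();

For example: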

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Step 1: load a page with GET to obtain the session cookies.
Connection.Response loginForm = Jsoup.connect("http://www.a5.cn")
        .method(Connection.Method.GET).execute();

// Step 2: POST the form data, reusing the cookies and sending Ajax-style headers.
Document document = Jsoup.connect("http://www.a5.cn/login.html")
        .data("formhash", "97bfbf").data("hdn_refer", "http://www.a5.cn/")
        .data("account", "userID").data("autoLogin", "1").data("password", "your password")
        .cookies(loginForm.cookies())
        .header("Accept", "application/json, text/javascript, */*; q=0.01")
        .header("X-Requested-With", "XMLHttpRequest")
        .post();

System.out.println(document.body().text());

