
How to navigate to other pages when pagination exists in the URL

I have a URL (http://myURL.com) from which I'm reading the content of a web page. The problem is that I can only read the content of page 1. When I use the jsoup API to read page 2, passing the page 2 URL taken from the pagination, the printed output is still the content of page 1, even though opening the same page 2 URL in a web browser shows the content of page 2. Any suggestions on how to read the contents of the other pages when the list is paginated?

Original URL:

http://myURL.com/myDocs/forms/AllItems.aspx?RootFolder=%2fsites%2docs3%2fmiscc%20Documents%2fstatus%20yearly%2f2017&FolderCTID=0x012906D46689EQWEPKA

URL of page 2 (after clicking the Next button to go to page 2 of the pagination):

http://myURL.com/myDocs/forms/AllItems.aspx?RootFolder=%2fsites%2docs3%2fmiscc%20Documents%2fstatus%20yearly%2f2017&FolderCTID=0x012906D46689EQWEPKA #InplviewHash038662ba-180e-41fc-8ad6-8b9805aa1b8b=Paged%3DTRUE-p_SortBehavior%3D0-p_FileLeafRef%3DGM%255fSW%2520TEAM%255fProgram%255fStatus%255f20170821%255fvFNAL%252epdf-p_ID%3D85-PageFirstRow%3D31-RootFolder%3D%252fsites%252fijjhhj3%252fyeal%2520Documents%252fstatus%2520Report%252f2017

Java code:

    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    public class Tester {
        private static final String page1URL = "http://myURL.com/myDocs/forms/AllItems.aspx?RootFolder=%2fsites%2docs3%2fmiscc%20Documents%2fstatus%20yearly%2f2017&FolderCTID=0x012906D46689EQWEPKA";
        private static final String page2URL = "http://myURL.com/myDocs/forms/AllItems.aspx?RootFolder=%2fsites%2docs3%2fmiscc%20Documents%2fstatus%20yearly%2f2017&FolderCTID=0x012906D46689EQWEPKA#InplviewHash038662ba-180e-41fc-8ad6-8b9805aa1b8b=Paged%3DTRUE-p_SortBehavior%3D0-p_FileLeafRef%3DGM%255fSW%2520TEAM%255fProgram%255fStatus%255f20170821%255fvFNAL%252epdf-p_ID%3D85-PageFirstRow%3D31-RootFolder%3D%252fsites%252fijjhhj3%252fyeal%2520Documents%252fstatus%2520Report%252f2017";

        public static void main(String[] args) throws IOException {
            // Fetch the page over HTTP and print the parsed HTML
            Document doc = Jsoup.connect(page1URL).get();
            System.out.println(doc);
        }
    }

In the above code, when I pass page2URL instead, it still shows only the contents of page 1, but when that URL is opened in the browser it shows the contents of page 2. Is that because page2URL is the URL produced by clicking the Next button on page 1 (pagination)?

PS: page2URL is the same as page1URL but with an extra part appended (#InplviewHash03....); please compare the two URLs to see the difference.

I suggest reading up on the meaning of # in a URL. It was originally meant as an anchor within a page, so that the browser could jump straight to that element. These days it is also used for AJAX, because the part after the # can be read via JavaScript. That part is not sent to the server in the HTTP request, which is why both of your URLs return the same response. For reference, see "What is the meaning of # in URL and how can i use that?"
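To make the point about # concrete, here is a minimal sketch that splits a URL into the part an HTTP client requests and the fragment that stays on the client. The class name `FragmentDemo`, the helper `withoutFragment`, and the shortened sample URL are made up for illustration; only `java.net.URI` from the standard library is used.

```java
import java.net.URI;

public class FragmentDemo {
    // Returns everything before '#', i.e. the part that is sent in the HTTP request.
    public static String withoutFragment(String url) {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    }

    public static void main(String[] args) {
        // Shortened, hypothetical version of the page 2 URL from the question.
        String page2 = "http://myURL.com/myDocs/forms/AllItems.aspx"
                + "?FolderCTID=0x012906D46689EQWEPKA"
                + "#InplviewHash038662ba-180e-41fc-8ad6-8b9805aa1b8b="
                + "Paged%3DTRUE-p_ID%3D85-PageFirstRow%3D31";

        System.out.println("Requested from the server: " + withoutFragment(page2));
        System.out.println("Kept on the client only:   " + URI.create(page2).getRawFragment());
    }
}
```

Since the requested part is identical for page 1 and page 2, fetching either URL without running the page's JavaScript yields the same HTML.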

This means your website contains JavaScript that loads the contents of page 2 after the original page has been fetched. As I explained to you before in the question you removed, jsoup does not run JavaScript, so you still need to identify the AJAX call and get the real parameters of that call. Once you have those, you can access the contents of page 2.
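As a possible first step towards identifying those parameters, the sketch below decodes the fragment to show the paging values it carries. It assumes the fragment has the form `name=` followed by a percent-encoded list of `key=value` pairs separated by `-`, as it appears in the question's URL; the class name `PagingState` and the method `parse` are hypothetical. The actual AJAX request (its URL and parameter names) still has to be confirmed in the browser's network tab, since it cannot be inferred from the fragment alone.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class PagingState {
    // Decodes "name=Key1%3DValue1-Key2%3DValue2-..." into an ordered key/value map.
    public static Map<String, String> parse(String rawFragment) {
        Map<String, String> state = new LinkedHashMap<>();
        int eq = rawFragment.indexOf('=');
        if (eq < 0) {
            return state; // no state attached to the fragment
        }
        // First decode turns %3D into '=' so the pairs become visible.
        String decoded = URLDecoder.decode(rawFragment.substring(eq + 1), StandardCharsets.UTF_8);
        for (String pair : decoded.split("-")) {
            int sep = pair.indexOf('=');
            if (sep < 0) {
                continue; // skip anything that is not a key=value pair
            }
            // Values in the question's URL are encoded twice (e.g. %255f), so decode once more.
            String value = URLDecoder.decode(pair.substring(sep + 1), StandardCharsets.UTF_8);
            state.put(pair.substring(0, sep), value);
        }
        return state;
    }

    public static void main(String[] args) {
        // Shortened fragment taken from the page 2 URL in the question.
        String fragment = "InplviewHash038662ba-180e-41fc-8ad6-8b9805aa1b8b="
                + "Paged%3DTRUE-p_ID%3D85-PageFirstRow%3D31";
        parse(fragment).forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Values such as `PageFirstRow` and `p_ID` are the ones to look for when inspecting the request the page sends after Next is clicked.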
