JavaScript-based dynamic content using HtmlUnit

I am stuck trying to get JavaScript-based dynamic content using HtmlUnit. I expect to get the sign-in and registration HTML content from the page, but with the following code I only get the static content.

I am new to HtmlUnit. Any help will be highly appreciated.

String strURL = "https://www.checkmytrip.com" ;
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(java.util.logging.Level.OFF);
java.util.logging.Logger.getLogger("org.apache.http").setLevel(java.util.logging.Level.OFF);

final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_31);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getCookieManager().setCookiesEnabled(true);
webClient.waitForBackgroundJavaScript(60 * 1000);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());

HtmlPage myPage = ((HtmlPage) webClient.getPage(strURL));

String theContent = myPage.getWebResponse().getContentAsString();
System.out.println(theContent);      

Two points:

  1. You need to call webClient.waitForBackgroundJavaScript() after you get the page, not before. Called before getPage(), there are no background JavaScript jobs to wait for yet, so the call returns without doing anything useful.
  2. You should use myPage.asText() or myPage.asXml() instead, because getWebResponse().getContentAsString() returns the original response from the server, without the changes made by JavaScript execution.

     String strURL = "https://www.checkmytrip.com";
     java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(java.util.logging.Level.OFF);
     java.util.logging.Logger.getLogger("org.apache.http").setLevel(java.util.logging.Level.OFF);

     try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_31)) {
         webClient.setAjaxController(new NicelyResynchronizingAjaxController());

         HtmlPage myPage = ((HtmlPage) webClient.getPage(strURL));
         webClient.waitForBackgroundJavaScript(10 * 1000);

         String theContent = myPage.asXml();
         System.out.println(theContent);
     }
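If a single fixed wait turns out not to be enough for the page, one option is to poll for the content you expect instead of guessing a timeout. The following is only a minimal sketch of that idea in plain Java, not part of HtmlUnit: the class name PollUntil and the method waitFor are hypothetical helpers, and the condition in main is a stand-in for "the dynamic content has appeared".

```java
import java.util.function.BooleanSupplier;

public class PollUntil {

    /**
     * Re-checks the condition every intervalMillis until it holds
     * or timeoutMillis has elapsed.
     * Returns true if the condition held, false on timeout or interruption.
     */
    public static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return false;
            }
            try {
                // Sleep for one interval, but never past the deadline.
                Thread.sleep(Math.max(1, Math.min(intervalMillis, remainingMillis)));
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for content that a script fills in about 300 ms later (hypothetical).
        final long readyAt = System.currentTimeMillis() + 300;
        boolean found = waitFor(() -> System.currentTimeMillis() >= readyAt, 2_000, 50);
        System.out.println("content appeared: " + found);
    }
}
```

With HtmlUnit, the condition could be something like () -> myPage.asXml().contains("Sign in"), where the marker text "Sign in" is an assumption about what the rendered page contains. This relies on the background JavaScript running while the calling thread sleeps, so treat it as a starting point and check it against the HtmlUnit version you use.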
