JavaScript-based dynamic content using HtmlUnit
I am stuck trying to get JavaScript-based dynamic content using HtmlUnit. I expect to get the sign-in and registration HTML content from the page, but with the following code I only get the static content.
I am new to HtmlUnit. Any help will be highly appreciated.
String strURL = "https://www.checkmytrip.com" ;
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(java.util.logging.Level.OFF);
java.util.logging.Logger.getLogger("org.apache.http").setLevel(java.util.logging.Level.OFF);
final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_31);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getCookieManager().setCookiesEnabled(true);
webClient.waitForBackgroundJavaScript(60 * 1000);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
HtmlPage myPage = ((HtmlPage) webClient.getPage(strURL));
String theContent = myPage.getWebResponse().getContentAsString();
System.out.println(theContent);
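Two points:
1. Call waitForBackgroundJavaScript() after getting the page, not before. Called before getPage(), there is no JavaScript running yet to wait for.
2. Use myPage.asText() or myPage.asXml() instead of getWebResponse().getContentAsString(), because getWebResponse() returns the original content as received from the server, without JavaScript execution.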
String strURL = "https://www.checkmytrip.com";
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(java.util.logging.Level.OFF);
java.util.logging.Logger.getLogger("org.apache.http").setLevel(java.util.logging.Level.OFF);
try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_31)) {
    webClient.setAjaxController(new NicelyResynchronizingAjaxController());
    HtmlPage myPage = ((HtmlPage) webClient.getPage(strURL));
    // Wait (up to 10 seconds) for background JavaScript started by the page
    webClient.waitForBackgroundJavaScript(10 * 1000);
    // asXml() serializes the DOM as it stands after JavaScript has run
    String theContent = myPage.asXml();
    System.out.println(theContent);
}
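A fixed wait such as the 10 seconds above may be longer than needed on a fast page and too short on a slow one. One common alternative is to poll for a condition until it holds or a timeout expires. The helper below is only a sketch of that idea using the Java standard library; the class name PollingWait, the method waitUntil, and its parameters are hypothetical and not part of HtmlUnit. The main method uses a timer as a stand-in for content that appears later.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HtmlUnit: polls a condition with a timeout.
class PollingWait {

    /**
     * Checks the condition repeatedly until it returns true or the timeout
     * elapses. Returns true if the condition was met, false on timeout or
     * if the waiting thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt status and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for dynamic content that shows up after roughly 200 ms.
        long start = System.currentTimeMillis();
        boolean appeared = waitUntil(() -> System.currentTimeMillis() - start >= 200, 2000, 50);
        System.out.println("content appeared: " + appeared);
    }
}
```

With HtmlUnit, the condition might be something like () -> myPage.asXml().contains("Sign in"), evaluated after webClient.getPage(strURL) has returned. The marker text "Sign in" is an assumption here; use whatever string or element reliably indicates that the dynamic content has loaded.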