Fetch ajax/javascript content using HtmlUnit
I have written code that fetches the HTML content of a page as a response, using HtmlUnit. But I am getting errors for some specific URLs, like:
https://communities.netapp.com/welcome
For the first page I am able to retrieve the contents. But I don't get the content that is loaded when the "load more" button is clicked.
Here's my code:
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.net.MalformedURLException;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
public class Sample {
    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, InterruptedException {
        String url = "https://communities.netapp.com/welcome";
        WebClient client = new WebClient(BrowserVersion.INTERNET_EXPLORER_9);
        client.getOptions().setJavaScriptEnabled(true);
        client.getOptions().setRedirectEnabled(true);
        client.getOptions().setThrowExceptionOnScriptError(true);
        client.getOptions().setCssEnabled(true);
        client.getOptions().setUseInsecureSSL(true);
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        client.setAjaxController(new NicelyResynchronizingAjaxController());
        HtmlPage page = client.getPage(url);
        Writer output = null;
        String text = page.asText();
        File file = new File("D://write6.txt");
        output = new BufferedWriter(new FileWriter(file));
        output.write(text);
        output.close();
        System.out.println("Your file has been written");
        // System.out.println("as Text ==" +page.asText());
        // System.out.println("asXML == " +page.asXml());
        // System.out.println("text content ==" +page.getTextContent());
        // System.out.println(page.getWebResponse().getContentAsString());
    }
}
Any suggestions?
As I understand from your question, the page has a button that you need to click.
Please look at: http://htmlunit.sourceforge.net/gettingStarted.html
It has an example of submitting a form.
Clicking the button should work very similarly here.
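As a rough illustration of that idea, here is a minimal sketch of clicking an element and reading the page afterwards. It assumes a reasonably recent HtmlUnit 2.x (the `com.gargoylesoftware.htmlunit` packages, with `WebClient.close()` available). The class name `LoadMoreSample`, the helper `clickAndRender`, and the XPath you pass in are hypothetical: I have not inspected the markup of the page in the question, so you would need to find the real selector for its "load more" control yourself. The sketch returns `asXml()`; calling `asText()` as in the question should work the same way on versions that still provide it.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;

public class LoadMoreSample {

    /**
     * Hypothetical helper: loads the page, clicks the first element matching
     * the given XPath, waits for background JavaScript, and returns the
     * resulting DOM serialized as XML.
     */
    public static String clickAndRender(WebClient client, String url, String xpath, long waitMillis)
            throws IOException {
        HtmlPage page = client.getPage(url);

        // Locate the element to click; the XPath is supplied by the caller.
        HtmlElement trigger = page.getFirstByXPath(xpath);
        if (trigger == null) {
            throw new IllegalStateException("No element matches XPath: " + xpath);
        }

        // click() returns the page shown in the window after the click.
        HtmlPage updated = trigger.click();

        // Give AJAX calls started by the click up to waitMillis to finish.
        client.waitForBackgroundJavaScript(waitMillis);

        return updated.asXml();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("Usage: LoadMoreSample <url> <xpath-of-element-to-click>");
            return;
        }
        WebClient client = new WebClient();
        // Real-world pages often contain scripts HtmlUnit cannot run cleanly,
        // so this sketch does not abort on script errors or failing status codes.
        client.getOptions().setThrowExceptionOnScriptError(false);
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        try {
            System.out.println(clickAndRender(client, args[0], args[1], 10_000));
        } finally {
            client.close();
        }
    }
}
```

If the extra content still does not show up, increasing the wait time or checking that the XPath really matches the button are the first things I would try.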