How can I extract HTML code from secure URLs (https://)?
I used the Jsoup library, but I'm not getting the full HTML content from secure URLs (https://), because the page loads its content dynamically. Is there any way to get the exact HTML content of secure URLs (https://)?
Jsoup only parses the HTML that the server returns and does not execute JavaScript, so dynamically loaded content is missing. To parse the complete content, you can use Selenium together with Jsoup: Selenium drives a real browser, and the rendered page source can then be handed to Jsoup.
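// Load the page in a real browser so that JavaScript-rendered content is present
WebDriver driver = new ChromeDriver();
driver.get("https://google.com/");
// Hand the rendered HTML to Jsoup for parsing
Document doc = Jsoup.parse(driver.getPageSource());
driver.quit(); // close the browser once the source has been captured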
Alternatively, you can wait until the page has finished loading before reading its source, as shown below:
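// Blocks until the browser reports document.readyState as "complete" (up to 30 seconds)
public void waitForLoad(WebDriver driver) {
    ExpectedCondition<Boolean> pageLoadCondition = new ExpectedCondition<Boolean>() {
        @Override
        public Boolean apply(WebDriver driver) {
            return ((JavascriptExecutor) driver).executeScript("return document.readyState").equals("complete");
        }
    };
    // Selenium 3 constructor; with Selenium 4 use new WebDriverWait(driver, Duration.ofSeconds(30))
    WebDriverWait wait = new WebDriverWait(driver, 30);
    wait.until(pageLoadCondition);
}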
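To make the waiting step less of a black box: the wait above amounts to re-checking a condition until it holds or a timeout elapses. The following is only a rough sketch of that idea using the plain JDK. The names `PollingWait` and `until` are hypothetical and are not Selenium's API, and unlike this sketch, Selenium's `WebDriverWait` throws a `TimeoutException` when the timeout is reached instead of returning false.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; this is not part of Selenium.
final class PollingWait {

    // Re-checks the condition every `interval` until it returns true
    // or `timeout` has elapsed. Returns whether the condition was met.
    static boolean until(BooleanSupplier condition, Duration timeout, Duration interval) {
        long start = System.nanoTime();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - start >= timeout.toNanos()) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long begin = System.currentTimeMillis();
        // Stand-in for "document.readyState is complete": becomes true after about 200 ms.
        boolean ready = until(
                () -> System.currentTimeMillis() - begin >= 200,
                Duration.ofSeconds(5),
                Duration.ofMillis(50));
        System.out.println("ready = " + ready);
    }
}
```

In the Selenium snippet, the condition being polled is the `document.readyState` check and the timeout is the 30 seconds passed to `WebDriverWait`.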