I want to scrape data from this website: http://myservices.ect.nl/tracing/objectstatus/Pages/Overview.aspx
I have used JSoup before for more static HTML sites, but this one is more difficult: before the HTML table appears I have to click a button, and I don't know whether JSoup can be used to click it.
After clicking the button I get an HTML table. I only want the rows where the modality is Barge.
Thank you for the tip to use Firefox. Now I get the table along with some other page information. Can you tell me how I can get only the table information? The output that I get is as follows:
You will have to use Selenium for that (the HtmlUnit driver, or the Firefox driver as in the example). Here is a full working example. It will visit the website, click the button, and then you can get the data from the page.
Edit: Only get the table value
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.ui.Select;

public class GetData {
    public static void main(String[] args) throws InterruptedException {
        WebDriver driver = new FirefoxDriver();
        try {
            driver.get("http://myservices.ect.nl/tracing/objectstatus/Pages/Overview.aspx");
            // Crude fixed wait for the page to finish loading
            Thread.sleep(5000);

            // Select "Barge" in the modality drop-down
            new Select(driver.findElement(By.id("ctl00_ctl15_g_ce17bd4b_3803_47f6_822a_2b8dd10fc67d_ctl00_dlModality")))
                    .selectByVisibleText("Barge");

            // Click the button and wait for the result table to appear
            Thread.sleep(3000);
            driver.findElement(By.className("button80")).click();
            Thread.sleep(5000);

            // Get only the text of the table
            WebElement table = driver.findElement(By.className("grid-view"));
            String htmlTableText = table.getText();

            // Do whatever you want now; these are the raw table values
            System.out.println(htmlTableText);
        } finally {
            // quit() closes every window and ends the session, so a separate close() is not needed
            driver.quit();
        }
    }
}
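If you want to work with individual rows rather than one block of text, the raw table text can be split into lines and filtered. This is only a sketch using the standard library: it assumes that getText() returns one table row per line and that the modality appears as plain text in each row, which you should check against the real output. The class and method names (TableText, rowsFor) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class TableText {
    /** Returns the non-blank lines of the table text that mention the given modality. */
    static List<String> rowsFor(String tableText, String modality) {
        List<String> rows = new ArrayList<>();
        // \R matches any line break (\n, \r\n, ...)
        for (String line : tableText.split("\\R")) {
            String row = line.trim();
            if (!row.isEmpty() && row.contains(modality)) {
                rows.add(row);
            }
        }
        return rows;
    }
}
```

With the example above you would call TableText.rowsFor(htmlTableText, "Barge"). Because it matches on plain text, a header row is dropped as long as it does not itself contain the word "Barge".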
Every "click" (or any interaction of that sort) is a request to the server and a response to the browser. So a possible solution is to use JSoup not for the initial page but for the result page. For instance, send a POST to the page that returns the table, passing the parameter that selects the modality Barge. You can use a tool like Firebug (for Firefox) or Chrome Developer Tools to inspect the conversation (request/response), so that you can emulate it in your own code.
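A rough sketch of emulating such a POST with the JDK's own HTTP client (Java 11+) is below. It is not taken from the site: the class name FormPost and the field names you pass in are placeholders, and the real names and values have to be copied from the request you see in the developer tools. Pages ending in .aspx usually also expect hidden fields such as __VIEWSTATE to be sent back, so a single parameter is probably not enough.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

class FormPost {
    /** Encodes name/value pairs as an application/x-www-form-urlencoded body. */
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    /** Sends the fields as a form POST and returns the response body (the result page HTML). */
    static String post(String url, Map<String, String> fields)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encode(fields)))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

The HTML returned by post(...) can then be handed to Jsoup.parse(html) to pick out the table, just as with a static page. Use a LinkedHashMap for the fields if the order in the request body matters to you.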
A browser emulator for Java may be useful for your problem. Consider HtmlUnit.
It models HTML documents and provides an API that allows you to invoke pages, fill out forms, click links, and so on, just as you do in your "normal" browser.