
How to extract data from AJAX/JavaScript websites using HtmlUnit? I'm trying to extract shipment history

I'm trying to extract the shipment history from this page: http://www.aramex.com/express/track-results.aspx?q=aWQ9MzU2NDQ4MTQ3Jg%3d%3d-ULINyZQtKrw%3d

This is my code:

public void aramexTracking() {
    WebClient webClient = new WebClient(BrowserVersion.CHROME);
    String trackingId = "9181468833";
    HtmlPage page1, page2;

    // Configure the client before loading any page so the options also apply to the first request
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    webClient.getOptions().setPrintContentOnFailingStatusCode(false);
    webClient.setCssErrorHandler(new com.gargoylesoftware.htmlunit.SilentCssErrorHandler());

    try {
        page1 = webClient.getPage("http://www.aramex.com/express/track.aspx");

        // Submitting form on Tracking Page
        HtmlForm form = page1.getFormByName("aspnetForm");
        HtmlButtonInput button = form.getInputByName("ctl00$ctl00$MainContent$InnerMainContent$btnGo");
        HtmlTextArea textArea = form.getTextAreaByName("ShipmentNumber");
        textArea.setText(trackingId);

        page2 = button.click();

        List<?> list = page2.getByXPath("//div[@id='dvSearchResults']/text()");
    } catch (FailingHttpStatusCodeException | IOException e) {
        e.printStackTrace();
    }
}

Please post a valid tracking number. I tried a random one, 3974937493, and would suggest a different XPath:

HtmlTable table = (HtmlTable) page2.getFirstByXPath("//div[@id='MainContent']//table//table");

After that, parse the rows of the table as usual:

if (table.getCellAt(1, 0) != null) System.out.println(table.getCellAt(1, 0).asText());
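
To read every row rather than a single cell, something along these lines might work. It is only a sketch: `ShipmentTableReader` and `readRows` are made-up names, and it assumes HtmlUnit 2.x (the `com.gargoylesoftware.htmlunit` packages used in the question), where `asText()` is still available. It also assumes the XPath really matches a table element; otherwise the assignment would fail with a `ClassCastException`.

```java
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlTable;
import com.gargoylesoftware.htmlunit.html.HtmlTableCell;
import com.gargoylesoftware.htmlunit.html.HtmlTableRow;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of HtmlUnit or of the question's code.
public class ShipmentTableReader {

    // Returns each row of the first table matching tableXPath as a list of
    // trimmed cell texts. Returns an empty list when nothing matches.
    public static List<List<String>> readRows(HtmlPage page, String tableXPath) {
        List<List<String>> rows = new ArrayList<>();
        HtmlTable table = page.getFirstByXPath(tableXPath);
        if (table == null) {
            return rows;
        }
        for (HtmlTableRow row : table.getRows()) {
            List<String> cells = new ArrayList<>();
            for (HtmlTableCell cell : row.getCells()) {
                // asText() as in the answer above (HtmlUnit 2.x)
                cells.add(cell.asText().trim());
            }
            rows.add(cells);
        }
        return rows;
    }
}
```

With the question's code this would be called as `ShipmentTableReader.readRows(page2, "//div[@id='MainContent']//table//table")`. If the results are filled in by AJAX after the click, calling `webClient.waitForBackgroundJavaScript(10000)` before reading the table may help; the 10-second timeout is an arbitrary guess, not a value known to suit this site.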
