
HtmlUnit: How to get the updated page after AJAX DOM manipulation

Using HtmlUnit 2.15, we are trying to scrape a third-party website. It contains a text box whose onblur handler calls a JavaScript function that adds an option to a select box on the same page.

With HtmlUnit, I can fire the onblur event successfully, but how do I get a handle to the changed page, which contains the newly added option element?

Code snippet:

final HtmlPage page = webClient.getPage(myUrl);

HtmlSelect selectDropDown = (HtmlSelect)page.getElementByName(selectname);
List<HtmlOption> options = selectDropDown.getOptions(); // returns 4 options

HtmlTextInput myTextBox = page.getElementByName(textboxname);
myTextBox.setValueAttribute("myText");
myTextBox.fireEvent(Event.TYPE_BLUR);

// Now how do I get the "updated" page? It should have 5 options

You need to wait until the JavaScript has changed the page. In my experience this can take a while, especially if a call to the server is involved.

My approach so far is to poll the page until it has changed in the way I expect.

Here is a method that waits for a given text to appear on the page:

    private static final int AJAX_MAX_TRIES_SECONDS = 30;

    /**
     * Waits until the given 'text' appears, or throws a
     * WaitingForAjaxTimeoutException if the 'text' does not appear before the timeout.
     * @param page The page to poll
     * @param text The text which indicates that ajax has finished updating the page
     * @param waitingLogMessage Text for the log output. Should indicate where in the code we are and what we are waiting for
     * @throws WaitingForAjaxTimeoutException if the text has not appeared after AJAX_MAX_TRIES_SECONDS polls
     */
    public static void waitForAjaxCallWaitUntilTextAppears(//
            @Nonnull final HtmlPage page, //
            @Nonnull final String text,//
            @Nonnull final String waitingLogMessage) throws WaitingForAjaxTimeoutException {
        LOGGER.debug("_5fd3fc9247_ waiting for ajax call to complete ... [" + waitingLogMessage + "]");
        final StringBuilder waitingdots = new StringBuilder("   ");
        for (int i = 0; i < AJAX_MAX_TRIES_SECONDS; i++) {

            if (page.asText().contains(text)) {
                waitingdots.append(" ajax has finished ['").append(text).append("' appeared]");
                LOGGER.debug("_8cd5a34faf_ " + waitingdots);
                return;
            }
            waitingdots.append('.');
            wait(page); // helper not shown in this answer; presumably pauses (about a second) between polls
        }
        LOGGER.debug("_de5091bc9e_ "
                + waitingdots.append(" ajax timeout ['").append(text).append("' appeared NOT]").toString());
        LOGGER.debug("_f1030addf1_ page source:\n" + page.asXml());
        throw new WaitingForAjaxTimeoutException();
    }
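
The same polling idea can be written without tying it to a text search. The following is only a minimal sketch, not part of the original answer: the class name AjaxPoller, the method waitUntil, and its parameters are hypothetical names chosen for illustration. It uses only the Java standard library and leaves the actual check to the caller.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not from the original answer): polls a caller-supplied
// condition until it holds or a timeout elapses.
final class AjaxPoller {

    private AjaxPoller() {
    }

    /**
     * Re-evaluates 'condition' every 'intervalMillis' milliseconds.
     *
     * @return true if the condition became true before the timeout,
     *         false on timeout or if the waiting thread was interrupted
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Overflow-safe comparison against the monotonic clock.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

For the question's case, and assuming the script modifies the same HtmlPage object in place, a call might look like AjaxPoller.waitUntil(() -> selectDropDown.getOptions().size() == 5, 30000, 500), made after firing the blur event. A false return then plays the role of the timeout exception above.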

Also make sure that JavaScript is enabled (which is the default):

webClient.getOptions().setJavaScriptEnabled(true);
