
Obtaining all name-value pairs in a form using Jsoup

I want to automate posting of a number of HTML forms using Jsoup and HttpClient. Most of those forms have hidden fields (with session ids, etc.) or have default values that I'd rather leave alone.

Coding each of the form submissions individually -- extracting each of said hidden or default values from the page -- is extremely tedious, so I thought about writing a generic method to obtain the list of HTTP parameters for a given form.

It is not a trivial piece of code, though, because of the variety of input tags and field types, each of which may need specific handling (e.g. textareas, checkboxes, radio buttons, selects, ...), so I thought I'd first search and ask in case it already exists.

Note: Jsoup and HttpClient are a given; I can't change that, so please don't post answers suggesting other libraries. I have a Jsoup Document object and I need to build an HttpClient HttpRequest.

So I've ended up writing it. I would still prefer to swap it for something field-tested (and hopefully maintained elsewhere), but in case it helps anyone landing here...

Not thoroughly tested and without support for multipart/form-data, but it works in the few examples I've tried:

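  // Needs org.jsoup.nodes.Element, java.util.List, java.util.Set and Guava's
  // Lists and Sets. fieldTypesToIgnore is a Set<String> defined elsewhere in
  // the class; it is not shown here.
  public void submit(String formSelector, List<String> params) {
    // params holds alternating names and values: name1, value1, name2, value2, ...
    if (params.size() % 2 != 0) {
      throw new IllegalArgumentException("There must be an even number of params.");
    }

    Element form= $(formSelector).first();

    // Names supplied by the caller override whatever the form contains.
    Set<String> newParams= Sets.newHashSet();
    for (int i=0; i < params.size(); i+= 2) {
      newParams.add(params.get(i));
    }

    List<String> allParams= Lists.newArrayList(params);
    for (Element field: form.select("input, select, textarea")) {
      String name= field.attr("name");
      // attr() returns "" (never null) when the attribute is missing.
      if (name.isEmpty() || newParams.contains(name)) continue;
      String type= field.attr("type").toLowerCase();
      if ("checkbox".equals(type) || "radio".equals(type)) {
        // Only checked checkboxes and radio buttons are submitted.
        if (field.hasAttr("checked")) {
          allParams.add(name);
          allParams.add(field.attr("value"));
        }
      }
      else if ("select".equals(field.tagName())) {
        // val() on a <select> reads its own value attribute, which is normally
        // absent, so take the selected option, falling back to the first one.
        Element option= field.select("option[selected]").first();
        if (option == null) option= field.select("option").first();
        if (option != null) {
          allParams.add(name);
          allParams.add(option.val());
        }
      }
      else if (! fieldTypesToIgnore.contains(type)) {
        allParams.add(name);
        allParams.add(field.val());
      }
    }

    // "abs:action" resolves the action against the document's base URI.
    String action= form.attr("abs:action");
    String method= form.attr("method").toLowerCase();
    // String encType= form.attr("enctype"); -- TODO

    if ("post".equals(method)) {
      post(action, allParams);
    }
    else {
      get(action, allParams);
    }
  }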

($, get, and post are methods I already had lying around... you can easily guess what they do).
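Since get and post are not shown, here is a minimal sketch of the step they presumably share: turning the flat list of alternating names and values into an application/x-www-form-urlencoded string, usable as a GET query string or a POST body. The class and method names are hypothetical, this is not the code from the answer, and it assumes Java 10 or later for the URLEncoder.encode(String, Charset) overload.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class FormEncodingSketch {

    // Hypothetical helper: joins alternating names and values
    // (name1, value1, name2, value2, ...) into "name1=value1&name2=value2",
    // URL-encoding every name and value as UTF-8.
    static String encode(List<String> params) {
        if (params.size() % 2 != 0) {
            throw new IllegalArgumentException("There must be an even number of params.");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < params.size(); i += 2) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(URLEncoder.encode(params.get(i), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(params.get(i + 1), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints "q=a+b&sid=x%26y"
        System.out.println(encode(Arrays.asList("q", "a b", "sid", "x&y")));
    }
}
```

The resulting string can be appended to the action URL after a "?" for GET, or sent as the request body with a Content-Type of application/x-www-form-urlencoded for POST.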

Jsoup has a formData method in the FormElement class; it works in simple cases, but it doesn't always do what I need, so I ended up writing some custom code too.
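For reference, a minimal sketch of what using formData looks like, assuming Jsoup is on the classpath. The class name, the helper method and the sample HTML are made up for illustration. Collecting into a Map is a simplification: it keeps only the last value when a name repeats, which may be one of the cases where custom code is still needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.FormElement;

public class FormDataSketch {

    // Hypothetical helper: returns the name/value pairs Jsoup reports for the
    // first <form> in the given HTML, in document order.
    static Map<String, String> firstFormValues(String html) {
        Document doc = Jsoup.parse(html, "http://example.com/");
        // Elements.forms() keeps only the selected elements that are forms.
        FormElement form = doc.select("form").forms().get(0);
        Map<String, String> values = new LinkedHashMap<>();
        for (Connection.KeyVal kv : form.formData()) {
            values.put(kv.key(), kv.value());
        }
        return values;
    }

    public static void main(String[] args) {
        String html = "<form action='/login' method='post'>"
            + "<input type='hidden' name='sid' value='abc123'>"
            + "<input type='text' name='user' value='guest'>"
            + "<input type='checkbox' name='remember' value='yes' checked>"
            + "<input type='checkbox' name='news' value='yes'>"
            + "<select name='lang'><option value='en'>English</option>"
            + "<option value='fr' selected>French</option></select>"
            + "<textarea name='note'>hello</textarea>"
            + "</form>";
        System.out.println(firstFormValues(html));
    }
}
```

With this sample form, the hidden field, the text field, the checked checkbox, the selected option and the textarea are reported, while the unchecked news checkbox is left out.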
