
Java Apache HttpClient Submitting Form

I am trying to submit a form on this website and get back the resulting misspellings from the text area as a string (only the "Reverse letters" checkbox should be selected). I have the code below, adapted from here:

private static void sendPost() throws Exception {
    String url = "http://tools.seobook.com/spelling/keywords-typos.cgi";
    HttpClient client = new DefaultHttpClient();
    HttpPost post = new HttpPost(url);
    post.setHeader("User-Agent", "Mozilla/5.0"); // add header
    List<NameValuePair> urlParameters = new ArrayList<NameValuePair>();

    //the input text area
    urlParameters.add(new BasicNameValuePair("user_input", "tomato potato"));   
    //the checkbox
    urlParameters.add(new BasicNameValuePair("reverse_letters", "reverse_letters")); 
    //the submit button (?)
    urlParameters.add(new BasicNameValuePair("", "generate typos"));

    post.setEntity(new UrlEncodedFormEntity(urlParameters));

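    // log the request before sending it; print the parameter list itself,
    // since post.getEntity() would only print the entity object
    System.out.println("\nSending 'POST' request to URL : " + url);
    System.out.println("Post parameters : " + urlParameters);

    HttpResponse response = client.execute(post);
    System.out.println("Response Code : " + 
            response.getStatusLine().getStatusCode());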

    BufferedReader rd = new BufferedReader(new InputStreamReader(
            response.getEntity().getContent()));

    // StringBuilder is enough here; no other thread touches the buffer
    StringBuilder result = new StringBuilder();
    String line;
    while ((line = rd.readLine()) != null) {
        result.append(line).append("\n");
    }
    System.out.println(result.toString());
}

If I copy the console output into an editor and search it for the misspellings, I can see that the huge string does contain both the input text and the text of the resulting text area. However, the string contains the entire HTML page, and I would like only the misspellings as a string. How would I extract only the resulting misspellings from this site, perhaps with a method from the Apache HttpClient library, or am I taking the wrong approach?

Thanks, Dan

I think you are trying to put a square peg in a round hole; Selenium would probably be a better bet.

Apache HttpClient is best used for sending requests and handling response headers, not for processing the body of a response.

An overcomplicated way would be to split the "result" variable using regexes.
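For what it's worth, the regex route could look roughly like the sketch below. It assumes the misspellings come back inside a <textarea> element, which matches what the question describes but is not verified against the actual page. The class name TextAreaExtractor, the method extractTextAreas and the sample HTML in main are all made up for illustration; in the question's code you would pass result.toString() instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TextAreaExtractor {

    // Matches <textarea ...>body</textarea>; DOTALL lets the body span several lines.
    private static final Pattern TEXTAREA = Pattern.compile(
            "<textarea\\b[^>]*>(.*?)</textarea>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the trimmed contents of every textarea in the HTML, in document order. */
    public static List<String> extractTextAreas(String html) {
        List<String> contents = new ArrayList<>();
        Matcher m = TEXTAREA.matcher(html);
        while (m.find()) {
            contents.add(m.group(1).trim());
        }
        return contents;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the response body; the real page's markup
        // and element names are assumptions, not taken from the site.
        String html = "<form><textarea name=\"user_input\">tomato potato</textarea></form>\n"
                + "<textarea name=\"results\">\notmato\nptoato\n</textarea>";

        List<String> areas = extractTextAreas(html);
        // Assumption: the output textarea is the last one on the page.
        System.out.println(areas.get(areas.size() - 1));
    }
}
```

This only pulls out the raw text between the tags. It does not decode HTML entities, and it will break if the site changes its markup, which is part of why regex on HTML counts as the overcomplicated option.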

