I am trying to submit a form on this website and get back the resulting misspellings from the text area as a string (only the "Reverse letters" checkbox should be selected). I have the code below, adapted from here:
private static void sendPost() throws Exception {
    String url = "http://tools.seobook.com/spelling/keywords-typos.cgi";
    HttpClient client = new DefaultHttpClient();
    HttpPost post = new HttpPost(url);
    post.setHeader("User-Agent", "Mozilla/5.0"); // add header

    List<NameValuePair> urlParameters = new ArrayList<NameValuePair>();
    // the input text area
    urlParameters.add(new BasicNameValuePair("user_input", "tomato potato"));
    // the checkbox
    urlParameters.add(new BasicNameValuePair("reverse_letters", "reverse_letters"));
    // the submit button (?)
    urlParameters.add(new BasicNameValuePair("", "generate typos"));
    post.setEntity(new UrlEncodedFormEntity(urlParameters));

    HttpResponse response = client.execute(post);
    System.out.println("\nSending 'POST' request to URL : " + url);
    System.out.println("Post parameters : " + post.getEntity());
    System.out.println("Response Code : " +
            response.getStatusLine().getStatusCode());

    BufferedReader rd = new BufferedReader(new InputStreamReader(
            response.getEntity().getContent()));
    StringBuffer result = new StringBuffer();
    String line = "";
    while ((line = rd.readLine()) != null) {
        result.append(line + "\n");
    }
    System.out.println(result.toString());
}
If I copy the console output into an editor and search it for the misspellings, I can see that both the input text and the resulting text area contents are contained in the huge string. However, the string contains the entire HTML page, and I would like only the misspellings as a string. How can I extract only the resulting misspellings from this site, perhaps with a method from the Apache HttpClient library, or am I taking the wrong approach?
Thanks, Dan
I think you are trying to put a square peg in a round hole; Selenium would probably be a better bet.
Apache HttpClient is best used for sending requests and handling response headers, not for processing the body of a response.
An overcomplicated alternative would be to pull the text out of the "result" variable using regexes.
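To illustrate the regex route, here is a minimal sketch that pulls the contents of every textarea element out of an HTML string. It assumes the misspellings come back inside a plain <textarea>...</textarea> pair, as the question describes; the class name TextareaExtractor, the method extractTextareas, and the sample HTML (including the "results" field name) are made up for illustration and are not taken from the site. It does not decode HTML entities, and a regex like this will break on unusual markup, so a real HTML parser would likely be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Apache HttpClient.
final class TextareaExtractor {

    // Matches <textarea ...>body</textarea>, ignoring tag case.
    // DOTALL lets the captured body span several lines.
    private static final Pattern TEXTAREA = Pattern.compile(
            "<textarea\\b[^>]*>(.*?)</textarea>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the trimmed body of each textarea, in document order.
    static List<String> extractTextareas(String html) {
        List<String> contents = new ArrayList<>();
        Matcher m = TEXTAREA.matcher(html);
        while (m.find()) {
            contents.add(m.group(1).trim());
        }
        return contents;
    }

    public static void main(String[] args) {
        // Made-up sample page: one input area, one result area.
        String html = "<form><textarea name=\"user_input\">tomato potato</textarea>\n"
                + "<textarea name=\"results\">otmato\ntmoato</textarea></form>";
        List<String> areas = extractTextareas(html);
        // Assumption: the result area is the last textarea on the page.
        System.out.println(areas.get(areas.size() - 1));
    }
}
```

In the code from the question, this would be called as extractTextareas(result.toString()) after the read loop. Which list entry holds the misspellings depends on the actual page layout, so check the returned list against the real response first.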