I am scraping a webpage that contains dates in this format: "8th November 2013". The scraped dates end up in an unordered array of strings. How can I convert these strings to a simple date format such as yyyy-MM-dd, so that I can sort them and use them with the calendar?
How about something like this?
private String dateLongStringConvert(String dateLongString) {
    // split long date string into string array: [day, month, year]
    String[] dateArray = dateLongString.split(" ");
    // get day of month as an integer (strip out non-numeric chars such as "th")
    int dayOfMonth = Integer.parseInt(dateArray[0].replaceAll("\\D+", ""));
    // convert month name to a two-digit number (switch on strings needs Java 7+)
    String month;
    switch (dateArray[1]) {
        // each case needs a break, otherwise execution falls through to "12"
        case "January":   month = "01"; break;
        case "February":  month = "02"; break;
        case "March":     month = "03"; break;
        case "April":     month = "04"; break;
        case "May":       month = "05"; break;
        case "June":      month = "06"; break;
        case "July":      month = "07"; break;
        case "August":    month = "08"; break;
        case "September": month = "09"; break;
        case "October":   month = "10"; break;
        case "November":  month = "11"; break;
        case "December":  month = "12"; break;
        default:
            throw new IllegalArgumentException("Unknown month: " + dateArray[1]);
    }
    // return formatted date string as yyyy-MM-dd
    return dateArray[2] + "-" + month + "-" + String.format("%02d", dayOfMonth);
}
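Because yyyy-MM-dd puts the largest unit first and zero-pads every field, strings in that format sort chronologically with an ordinary string sort, as long as all years have four digits. A minimal sketch of that ordering step; the class name IsoStringSort, the method name sortIsoDates and the sample values are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class IsoStringSort {
    // Returns a sorted copy of the input. For yyyy-MM-dd strings with
    // four-digit years, natural String order matches date order.
    static List<String> sortIsoDates(List<String> isoDates) {
        List<String> sorted = new ArrayList<>(isoDates);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, already converted to yyyy-MM-dd.
        List<String> dates = Arrays.asList("2013-11-08", "2012-03-21", "2013-02-01");
        System.out.println(sortIsoDates(dates)); // [2012-03-21, 2013-02-01, 2013-11-08]
    }
}
```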
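import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
String inputDate = "8th November 2013";
inputDate = inputDate.replaceAll("(\\d)(st|nd|rd|th)\\.?", "$1"); // strip the ordinal suffix after the day number
// parse() throws java.text.ParseException; declare or catch it in the enclosing method
Date date = new SimpleDateFormat("d MMMM yyyy", Locale.ENGLISH).parse(inputDate); // parse input date
String outputDate = new SimpleDateFormat("yyyy-MM-dd").format(date); // format to output date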
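On Java 8 or later, the same conversion can also be done with java.time, which yields LocalDate values that sort directly. The following is a minimal sketch rather than part of the answers here: the class name LongDateParser and its method names are made up for illustration, and it assumes every input looks like "8th November 2013" with an English month name.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class LongDateParser {
    // Matches the input once the ordinal suffix is gone, e.g. "8 November 2013".
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("d MMMM uuuu", Locale.ENGLISH);

    // Parses a string such as "8th November 2013" into a LocalDate.
    static LocalDate parse(String text) {
        // Remove st/nd/rd/th only when it directly follows a digit,
        // so the "st" in "August" is left alone.
        String cleaned = text.trim().replaceAll("(?<=\\d)(st|nd|rd|th)\\b", "");
        return LocalDate.parse(cleaned, INPUT);
    }

    // LocalDate.toString() already produces ISO-8601 (yyyy-MM-dd).
    static String toIso(String text) {
        return parse(text).toString();
    }

    // Converts all inputs and returns them in chronological order.
    static List<String> sortedIso(List<String> scraped) {
        List<LocalDate> dates = new ArrayList<>();
        for (String s : scraped) {
            dates.add(parse(s));
        }
        Collections.sort(dates); // LocalDate is Comparable
        List<String> result = new ArrayList<>();
        for (LocalDate d : dates) {
            result.add(d.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical scraped values.
        List<String> scraped = Arrays.asList(
                "8th November 2013", "21st March 2012", "1st August 2014");
        System.out.println(sortedIso(scraped)); // [2012-03-21, 2013-11-08, 2014-08-01]
    }
}
```

Input that does not match the pattern makes LocalDate.parse throw DateTimeParseException, which is unchecked, so malformed scraped values need their own handling.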
A more robust way to do this is to use a temporal parser such as the Stanford Temporal Tagger (SUTime) and have it extract the dates from the text. The team provides an online demo ( http://nlp.stanford.edu:8080/sutime/process ) for evaluating the tool.
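If the dates end up in a SQL database (Oracle or PostgreSQL, for example), to_char can format them there: to_char(date_value, 'YYYY / MM / DD HH24:MI:SS'), where date_value stands for your own date column or expression.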