
Lenient SimpleDateFormat acting strange

I understand that, in order to properly validate date strings, one must make DateFormat instances non-lenient to get all ParseExceptions from malformed dates. But consider

String dubiousDate = "2014-04-01";
DateFormat sdf = new SimpleDateFormat("yyyyMMdd");
Date d;
try {
    d = sdf.parse(dubiousDate);
    System.out.println(dubiousDate + " -> " + d);
} catch (ParseException e) {
    e.printStackTrace();
    System.err.println(dubiousDate + " failed");
}

this will give

2014-04-01 -> Wed Dec 04 00:00:00 CET 2013

Now I can understand that lenient calendars try to be nice and accept funny negative numbers, but in this interpretation it looks as if the -01 is treated as the month, even though it appears last, where the day belongs. And the -04 month becomes day 04, with the minus sign ignored.

In all leniency, why would this make sense to anyone?

I see another possible interpretation:

In the pattern yyyyMMdd the month part is limited to exactly two characters because there are no separators between the numeric fields. So "-0" is read as the month, which is simply zero, i.e. one month before January, yielding December of the previous year.

After this fake month has been "parsed", the day part reads "4" and stops before the second minus sign. The result is the fourth of December.
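The month rollover described above can be reproduced without any parsing at all. The following is only a minimal sketch: the class name LenientRollover and the helper resolve are made up for illustration, and it assumes a plain GregorianCalendar in its default lenient mode, which is what SimpleDateFormat uses unless configured otherwise.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical helper class, not part of the JDK.
class LenientRollover {
    // Resolves year/month/day the way a lenient calendar does.
    // monthNumber is 1-based, as in the text matched by "MM";
    // Calendar.MONTH is 0-based, hence the "- 1".
    static int[] resolve(int year, int monthNumber, int day) {
        Calendar cal = new GregorianCalendar();
        cal.clear();                 // drop the current date and time
        cal.setLenient(true);        // the default, stated for clarity
        cal.set(Calendar.YEAR, year);
        cal.set(Calendar.MONTH, monthNumber - 1);
        cal.set(Calendar.DAY_OF_MONTH, day);
        return new int[] {
            cal.get(Calendar.YEAR),
            cal.get(Calendar.MONTH) + 1,
            cal.get(Calendar.DAY_OF_MONTH)
        };
    }

    public static void main(String[] args) {
        // Month "0" of 2014 rolls back to December 2013.
        int[] ymd = resolve(2014, 0, 4);
        System.out.println(ymd[0] + "-" + ymd[1] + "-" + ymd[2]); // 2013-12-4
    }
}
```

Calling cal.setLenient(false) instead makes the calendar throw an IllegalArgumentException as soon as the fields are resolved.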

Finally, the remaining characters "-01" are simply ignored. This is typical of how SimpleDateFormat handles trailing characters it cannot use; for example, see this code:

String dubiousDate = "2014-04-01xyz";
DateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
Date d;
try {
    d = sdf.parse(dubiousDate);
    System.out.println(dubiousDate + " -> " + d);
    // output: Tue Apr 01 00:00:00 CEST 2014
} catch (ParseException e) {
    e.printStackTrace();
    System.err.println(dubiousDate + " failed");
}

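As a rule of thumb, a two-letter field such as MM or dd that is directly followed by another numeric field consumes at most two characters (if digits are found). The SimpleDateFormat documentation says the number of pattern letters is ignored for parsing unless it is needed to separate two adjacent fields.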
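If the goal is validation, one way to catch both problems (out-of-range fields and ignored trailing characters) is to turn leniency off and check how far the parser actually got. This is only a sketch: the class StrictParse and the method parseWholeText are hypothetical names, and Locale.ROOT is chosen here just to keep the result independent of the default locale.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper class, not part of the JDK.
class StrictParse {
    // Returns the parsed Date, or null if the text is rejected.
    static Date parseWholeText(String pattern, String text) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.ROOT);
        sdf.setLenient(false);                  // reject out-of-range field values
        ParsePosition pos = new ParsePosition(0);
        Date d = sdf.parse(text, pos);          // returns null on failure, no exception
        if (d == null || pos.getIndex() != text.length()) {
            return null;                        // parse error or unconsumed trailing text
        }
        return d;
    }

    public static void main(String[] args) {
        System.out.println(parseWholeText("yyyyMMdd", "2014-04-01"));           // null
        System.out.println(parseWholeText("yyyy-MM-dd", "2014-04-01xyz"));      // null
        System.out.println(parseWholeText("yyyy-MM-dd", "2014-04-01") != null); // true
        System.out.println(parseWholeText("yyyyMMdd", "20140401") != null);     // true
    }
}
```

The ParsePosition overload is used because parse(String) alone never reports how much of the input was consumed.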

Some research about Java 8:

DateTimeFormatterBuilder builder = new DateTimeFormatterBuilder();
builder.parseLenient();
builder.append(DateTimeFormatter.ofPattern("yyyyMMdd"));
DateTimeFormatter dtf = builder.toFormatter();
String dubiousDate = "2014-04-01";
LocalDate date = LocalDate.parse(dubiousDate, dtf);
System.out.println(date);

According to the JDK 8 documentation, a formatter constructed this way should behave leniently, but unfortunately it still throws an exception:

"Exception in thread "main" java.time.format.DateTimeParseException: Text '2014-04-01' could not be parsed at index 3"

In theory, the best option in the lenient case would be for the parser to just ignore the minus characters. But evidently this is not possible with JSR-310 out of the box (it is still too strict). SimpleDateFormat, on the other hand, is lenient, but in rather the wrong way.
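If hyphens really should be tolerated, they can be declared as optional explicitly instead of relying on leniency. The following is a sketch under assumptions rather than a recommendation: the class name OptionalHyphenDate is made up, the year is assumed to be exactly four digits, and ResolverStyle.STRICT is my own choice so that impossible dates such as April 31 are rejected.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.time.temporal.ChronoField;

// Hypothetical helper class, not part of the JDK.
class OptionalHyphenDate {
    // Fixed-width numeric fields with an optional '-' between them.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendValue(ChronoField.YEAR, 4)
            .optionalStart().appendLiteral('-').optionalEnd()
            .appendValue(ChronoField.MONTH_OF_YEAR, 2)
            .optionalStart().appendLiteral('-').optionalEnd()
            .appendValue(ChronoField.DAY_OF_MONTH, 2)
            .toFormatter()
            .withResolverStyle(ResolverStyle.STRICT);

    // Throws DateTimeParseException if the text does not match.
    static LocalDate parse(String text) {
        return LocalDate.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(parse("2014-04-01")); // 2014-04-01
        System.out.println(parse("20140401"));   // 2014-04-01
        try {
            parse("2014-04-01xyz");
        } catch (DateTimeParseException e) {
            System.out.println("rejected: " + e.getParsedString());
        }
    }
}
```

Note that this formatter also accepts mixed input such as "2014-0401", since each hyphen is optional on its own.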

This doesn't make sense. It sounds like a bug to me.

I think the right answer is to use Java 8, where dates are finally done right. Your code could, for example, change to something like the following, and Java will throw an exception, as it should.

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class Main {
  public static void main(String[] args) {
    String dubiousDate = "2014-04-01";
    try {
      DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyyMMdd");
      // The hyphens do not match the pattern, so parse() throws.
      LocalDate d = LocalDate.parse(dubiousDate, formatter);
      System.out.println(dubiousDate + " -> " + d);
    } catch (DateTimeParseException e) {
      e.printStackTrace();
      System.err.println(dubiousDate + " failed");
    }
  }
}
