SimpleDateFormat: Inconsistent Pattern Letters

Recently I looked into the documentation for SimpleDateFormat and noticed what look to me like inconsistencies in how the pattern letters are assigned.

For example, look at these representations:

M: Month in year
D: Day in year
d: Day in month

"x in year" is a bigger time span than "x in month" and therefore gets the uppercase letter, which makes perfect sense to me.

But then there is

w: Week in year
W: Week in month

Here, the letters are swapped, which is totally counter-intuitive in my opinion. It seems like these two should be the other way around, to conform to the "pattern" mentioned above.
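To see the two pairs side by side, here is a minimal sketch that formats a single date with each letter. The class name `WeekLetters` and its helpers are made up for illustration. The time zone and week rules are pinned to UTC, Sunday as first day of week, and a one-day minimal first week, since `w` and `W` otherwise depend on the locale.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of the JDK.
class WeekLetters {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Builds a java.util.Date for noon UTC on the given calendar day.
    static Date noonUtc(int year, int month, int day) {
        Calendar c = new GregorianCalendar(UTC);
        c.clear();
        c.set(year, month - 1, day, 12, 0, 0); // Calendar months are 0-based
        return c.getTime();
    }

    // Formats with pinned week rules, because w and W are locale-sensitive.
    static String format(String pattern, Date date) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.US);
        f.setTimeZone(UTC);
        f.getCalendar().setFirstDayOfWeek(Calendar.SUNDAY);
        f.getCalendar().setMinimalDaysInFirstWeek(1);
        return f.format(date);
    }

    public static void main(String[] args) {
        Date d = noonUtc(2024, 3, 15); // Friday, 15 March 2024
        System.out.println("D (day in year):   " + format("D", d)); // 75
        System.out.println("d (day in month):  " + format("d", d)); // 15
        System.out.println("w (week in year):  " + format("w", d)); // 11
        System.out.println("W (week in month): " + format("W", d)); // 3
    }
}
```

With `D`/`d` the uppercase letter carries the larger "in year" value, while with `w`/`W` it is the lowercase one.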

Another example is the different hour representations:

H: Hour in day (0-23)
k: Hour in day (1-24)
K: Hour in am/pm (0-11)
h: Hour in am/pm (1-12)

I kind of get the idea: uppercase letters for hours starting at 0, lowercase letters for hours starting at 1. But then the two lowercase letters should be swapped, because shouldn't the same letter belong to the same category (H/h for hour in day, K/k for hour in am/pm)?
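A small sketch makes the four hour letters easier to compare. The class `HourLetters` and its helper names are illustrative only, and the time zone is fixed to UTC so the result does not depend on the machine it runs on.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of the JDK.
class HourLetters {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Builds a Date on 15 March 2024 at the given hour of day (0-23), UTC.
    static Date atHourUtc(int hourOfDay) {
        Calendar c = new GregorianCalendar(UTC);
        c.clear();
        c.set(2024, Calendar.MARCH, 15, hourOfDay, 0, 0);
        return c.getTime();
    }

    // Formats the date with all four hour letters: "H k K h".
    static String allFour(Date date) {
        SimpleDateFormat f = new SimpleDateFormat("H k K h", Locale.US);
        f.setTimeZone(UTC);
        return f.format(date);
    }

    public static void main(String[] args) {
        System.out.println("hour -> H k K h");
        for (int hour : new int[] {0, 1, 12, 13, 23}) {
            System.out.println(hour + " -> " + allFour(atHourUtc(hour)));
        }
        // Midnight prints "0 24 0 12", noon prints "12 12 0 12".
    }
}
```

At midnight the two zero-based letters (`H`, `K`) both give 0, while the one-based ones give 24 (`k`) and 12 (`h`), so `h` pairs with `K` and `k` pairs with `H`.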

So my question is this: Is there a reason behind this seemingly counter-intuitive representation?

The only reason I could think of is that some of these pattern letters were added at a later time and the existing ones couldn't be changed because of backward compatibility. Other than that, it doesn't make much sense to me.

Quoting the question:

"The only reason I could think of is that some of these pattern letters were added at a later time and the existing ones couldn't be changed because of backward compatibility."

Your suspicion is correct. But you cannot blame only the Sun (or Oracle) designers for that. They simply took the whole thing over from Taligent (since merged into IBM). And IBM itself is one of the leading companies behind the Unicode Consortium, which defined the CLDR standard. That standard defines all these pattern symbols (indeed in a totally inconsistent manner, explainable only by historical development).

Worse, the inconsistencies in CLDR don't stop there: recently a NARROW variant was added alongside SHORT, LONG, etc. That means if you want the shortest possible representation of a month, a single letter, you need to specify the pattern symbol MMMMM (five letters, because the single letter M is already reserved for the short numeric form).
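As a rough illustration, the sketch below formats January with different counts of `M`. It assumes Java 8's `java.time.format.DateTimeFormatter` as the CLDR-style formatter, since the old `SimpleDateFormat` has no narrow form and treats four or more letters as the full name. The class `NarrowMonth` and the method names `modern` and `legacy` are invented for this example.

```java
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of the JDK.
class NarrowMonth {
    // Formats with java.time (Java 8+), which knows the narrow form.
    static String modern(String pattern, LocalDate date) {
        return DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH).format(date);
    }

    // Formats with the old SimpleDateFormat, pinned to UTC.
    static String legacy(String pattern, LocalDate date) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.ENGLISH);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        Date noon = Date.from(date.atTime(12, 0).toInstant(ZoneOffset.UTC));
        return f.format(noon);
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2024, 1, 15);
        System.out.println(modern("M", d));     // 1
        System.out.println(modern("MMM", d));   // Jan
        System.out.println(modern("MMMM", d));  // January
        System.out.println(modern("MMMMM", d)); // J
        System.out.println(legacy("MMMMM", d)); // January
    }
}
```

So the shortest textual form needs the longest pattern, and the same five letters produce different output depending on which formatter reads them.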

Another note: SimpleDateFormat does not even strictly follow CLDR. For example, in Java 7 Oracle defined the pattern symbol "u" as the ISO day number of the week (1 = Monday, ..., 7 = Sunday), although CLDR had already introduced the same symbol earlier as the proleptic ISO year. And Java 8 deviates again: it invents new symbols not known in CLDR but otherwise tries to follow CLDR more closely.
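The clash over "u" can be reproduced with a short sketch. It assumes that "Java 8" here means the `java.time.format.DateTimeFormatter` class, where `u` is a year, and it needs Java 7 or later for `u` in `SimpleDateFormat`. The class `LetterU` and its method names are made up for illustration.

```java
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of the JDK.
class LetterU {
    // SimpleDateFormat (Java 7+): "u" is the ISO day number of the week.
    static String legacy(String pattern, LocalDate date) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.ENGLISH);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        Date noon = Date.from(date.atTime(12, 0).toInstant(ZoneOffset.UTC));
        return f.format(noon);
    }

    // DateTimeFormatter (Java 8+): "u" is the proleptic year.
    static String modern(String pattern, LocalDate date) {
        return DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH).format(date);
    }

    public static void main(String[] args) {
        LocalDate monday = LocalDate.of(2024, 1, 15);
        LocalDate sunday = LocalDate.of(2024, 1, 21);
        System.out.println(legacy("u", monday)); // 1
        System.out.println(legacy("u", sunday)); // 7
        System.out.println(modern("u", monday)); // 2024
    }
}
```

A pattern string moved from one formatter to the other compiles and runs without complaint but silently changes meaning, which is the practical cost of these deviations.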

We already have remarkable differences between the pattern languages (compare Java 6, Java 7, Java 8, pure CLDR and Joda-Time). And I fear this will never stop.
