Why does parsing "0000:00:00 00:00:00" into a Date return -0001-11-28T00:00:00Z?
Why does the following code output -0001-11-28T00:00:00Z instead of 0000-00-00T00:00:00Z?
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.text.ParseException;
import java.util.Date;
import java.util.TimeZone;

class Main
{
    public static void main (String[] args) throws ParseException
    {
        DateFormat parser = new SimpleDateFormat("yyyy:MM:dd HH:mm:ss");
        parser.setTimeZone(TimeZone.getTimeZone("GMT"));
        Date date = parser.parse("0000:00:00 00:00:00");
        System.out.println(date.toInstant());
    }
}
My first thought was that this was a time-zone problem, but the output is a whopping 34 days earlier than the expected date.
This is a 3rd-party library, so I cannot actually modify the code, but if I can understand why it is returning this value then maybe I can tweak the inputs to get the desired output.
In case you're wondering, 0000:00:00 00:00:00 comes from the EXIF metadata of images or videos.
Note that there is no differentiation between year-of-era and year in the legacy API. The year 0 is actually 1 BC. The month 0 and day 0 are invalid values, but instead of throwing an exception, SimpleDateFormat (in its default lenient mode) parses them erroneously.
The reason for the month being converted to 11:
SimpleDateFormat decreases the month numeral in the text by 1 because months in java.util.Date are 0-based. In other words, month 1 is parsed by SimpleDateFormat as 0, which is the month Jan for java.util.Date. Similarly, month 0 is parsed by SimpleDateFormat as -1. Now, a negative month is treated by java.util.Date as follows:
month = CalendarUtils.mod(month, 12);
and CalendarUtils#mod has been defined as follows:
public static final int mod(int x, int y) {
    return (x - y * floorDivide(x, y));
}

public static final int floorDivide(int n, int d) {
    return ((n >= 0) ?
            (n / d) : (((n + 1) / d) - 1));
}
Thus, CalendarUtils.mod(-1, 12) returns 11.
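To check the arithmetic by hand, here is a minimal sketch that copies the two helpers quoted above into a standalone class. FloorModDemo is a hypothetical name used only for illustration; it is not part of the JDK. The sketch also compares the result with Java's % operator and with Math.floorMod from the standard library.

```java
// Standalone copy of the two helpers quoted above, for illustration only.
// FloorModDemo is a hypothetical class name, not a JDK class.
class FloorModDemo {
    // Division that rounds toward negative infinity (assumes d > 0).
    static int floorDivide(int n, int d) {
        return (n >= 0) ? (n / d) : (((n + 1) / d) - 1);
    }

    // Remainder that is always in the range 0..y-1 for a positive y.
    static int mod(int x, int y) {
        return x - y * floorDivide(x, y);
    }

    public static void main(String[] args) {
        System.out.println(mod(-1, 12));           // 11: floor-based remainder
        System.out.println(-1 % 12);               // -1: % keeps the sign of the dividend
        System.out.println(Math.floorMod(-1, 12)); // 11: standard-library equivalent
    }
}
```

floorDivide(-1, 12) evaluates to ((0 / 12) - 1) = -1, so mod(-1, 12) is -1 - 12 * (-1) = 11, i.e. the 0-based index of December.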
java.util.Date and SimpleDateFormat are full of such surprises. It is recommended to stop using them completely and switch to the modern date-time API.
The modern date-time API differentiates between year-of-era and year using y and u respectively.
y specifies the year-of-era (the era is specified as AD or BC) and is always a positive number, whereas u specifies the year, which is a signed (+/-) number.
Normally, we do not use a + sign to write a positive number, but we always write a negative number with a - sign. The same rule applies to a year. As long as you are using a year in the AD era, both y and u will give you the same number. However, you will get different numbers for a year in the BC era, e.g. the year-of-era 1 BC is specified as year 0; the year-of-era 2 BC is specified as year -1, and so on.
You can understand it better with the following demo:
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class Testing {
    public static void main(String[] args) {
        System.out.println(LocalDate.of(-1, 1, 1).format(DateTimeFormatter.ofPattern("u M d")));
        System.out.println(LocalDate.of(-1, 1, 1).format(DateTimeFormatter.ofPattern("y M d")));
        System.out.println(LocalDate.of(-1, 1, 1).format(DateTimeFormatter.ofPattern("yG M d")));
        System.out.println();
        System.out.println(LocalDate.of(0, 1, 1).format(DateTimeFormatter.ofPattern("u M d")));
        System.out.println(LocalDate.of(0, 1, 1).format(DateTimeFormatter.ofPattern("y M d")));
        System.out.println(LocalDate.of(0, 1, 1).format(DateTimeFormatter.ofPattern("yG M d")));
        System.out.println();
        System.out.println(LocalDate.of(1, 1, 1).format(DateTimeFormatter.ofPattern("u M d")));
        System.out.println(LocalDate.of(1, 1, 1).format(DateTimeFormatter.ofPattern("y M d")));
        System.out.println(LocalDate.of(1, 1, 1).format(DateTimeFormatter.ofPattern("yG M d")));
    }
}
Output:
-1 1 1
2 1 1
2BC 1 1

0 1 1
1 1 1
1BC 1 1

1 1 1
1 1 1
1AD 1 1
How does the modern date-time API treat 0000:00:00 00:00:00?
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class Main {
    public static void main(String[] args) {
        DateTimeFormatter parser = DateTimeFormatter.ofPattern("uuuu:MM:dd HH:mm:ss")
                .withZone(ZoneOffset.UTC)
                .withLocale(Locale.ENGLISH);
        ZonedDateTime zdt = ZonedDateTime.parse("0000:00:00 00:00:00", parser);
    }
}
Output:
Exception in thread "main" java.time.format.DateTimeParseException: Text '0000:00:00 00:00:00' could not be parsed: Invalid value for MonthOfYear (valid values 1 - 12): 0
....
With DateTimeFormatter#withResolverStyle(ResolverStyle.LENIENT):
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.ResolverStyle;
import java.util.Locale;

public class Main {
    public static void main(String[] args) {
        DateTimeFormatter dtf = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss", Locale.ENGLISH)
                .withResolverStyle(ResolverStyle.LENIENT);
        String str = "0000-00-00 00:00:00";
        LocalDateTime ldt = LocalDateTime.parse(str, dtf);
        System.out.println(ldt);
    }
}
Output:
-0001-11-30T00:00
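The lenient java.time result, -0001-11-30, is two days later than the -0001-11-28 printed in the question. One plausible explanation, offered here as an assumption rather than something confirmed in this thread, is the calendar system: by default java.util.GregorianCalendar is documented to switch to Julian rules for dates before October 1582, while Instant always prints in the proleptic ISO calendar. The sketch below is one way to test that idea. It parses the same text twice, once with a default (hybrid) calendar and once with a pure Gregorian one obtained via setGregorianChange(new Date(Long.MIN_VALUE)). CalendarSystemDemo and parseWith are hypothetical names used for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical demo class: compares the hybrid Julian/Gregorian default
// with a pure (proleptic) Gregorian calendar for the same input text.
class CalendarSystemDemo {
    static String parseWith(GregorianCalendar calendar) {
        SimpleDateFormat parser = new SimpleDateFormat("yyyy:MM:dd HH:mm:ss");
        parser.setCalendar(calendar); // the parser resolves fields with this calendar
        try {
            return parser.parse("0000:00:00 00:00:00").toInstant().toString();
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TimeZone utc = TimeZone.getTimeZone("UTC");

        // Default behaviour: Julian rules before the 1582 cutover.
        GregorianCalendar hybrid = new GregorianCalendar(utc);

        // Pure Gregorian calendar, as described in the GregorianCalendar docs.
        GregorianCalendar pureGregorian = new GregorianCalendar(utc);
        pureGregorian.setGregorianChange(new Date(Long.MIN_VALUE));

        System.out.println(parseWith(hybrid));
        System.out.println(parseWith(pureGregorian));
    }
}
```

If the assumption holds, the first line reproduces the value from the question and the second line shows the same date as the lenient LocalDateTime result, which would account for the two-day gap.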
As explained by other answers, this is a result of processing an invalid timestamp (invalid year, month and day values) with a legacy class (SimpleDateFormat) that doesn't do proper validation.
In short... garbage in, garbage out (see note 1).
Solutions:
1. Rewrite the code that uses SimpleDateFormat to use the new date / time classes introduced in Java 8. (Or use a backport if you have to use Java 7 or earlier.)
2. Work around the problem by testing for this specific case before you attempt to process the string as a date.
It seems from the context that "0000:00:00 00:00:00" is the EXIF way of saying "no such datetime". If that's the case, then trying to treat it as a datetime seems counter-productive. Treat it as a special case instead.
3. If you can't rewrite the code or work around the problem, submit a bug report and/or patch against the (3rd party) library and hope for the best...
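The special-case approach could look roughly like the sketch below. It assumes you get to see the raw EXIF string before it reaches the date parser. ExifDates, NO_DATE and parse are hypothetical names invented for this example, not part of any EXIF library. The all-zero value is mapped to an empty Optional, and everything else goes through a strict java.time formatter so that other malformed values are rejected rather than silently shifted.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Optional;

// Hypothetical helper: treats the all-zero EXIF value as "no date".
class ExifDates {
    static final String NO_DATE = "0000:00:00 00:00:00";

    // 'u' (year) is used instead of 'y' (year-of-era) so STRICT mode
    // does not require an era field.
    private static final DateTimeFormatter EXIF =
            DateTimeFormatter.ofPattern("uuuu:MM:dd HH:mm:ss")
                    .withResolverStyle(ResolverStyle.STRICT);

    static Optional<LocalDateTime> parse(String raw) {
        if (raw == null || raw.trim().equals(NO_DATE)) {
            return Optional.empty(); // sentinel: no timestamp recorded
        }
        try {
            return Optional.of(LocalDateTime.parse(raw.trim(), EXIF));
        } catch (DateTimeParseException e) {
            return Optional.empty(); // any other invalid value is also "no date"
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("0000:00:00 00:00:00")); // Optional.empty
        System.out.println(parse("2021:03:04 05:06:07")); // Optional[2021-03-04T05:06:07]
    }
}
```

If the parsing happens inside a library you cannot change, the same check on NO_DATE can be applied to the metadata value before the library is asked to convert it.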
1 - Why the discrepancy is exactly 1 year and 34 days is a bit of a mystery, but I'm sure you could figure out the explanation by diving into the source code. IMO, it is not worth the effort. However, I can't imagine why the Gregorian shift would be implicated in this...
This is because year 0 is invalid; it doesn't exist. https://en.m.wikipedia.org/wiki/Year_zero
The month and day are also invalid because they are 0.