How to parse non-standard month names with DateTimeFormatter

I need to parse (German) dates that come in the following form:

10. Jan. 18:14
8. Feb. 19:02
1. Mär. 19:40
4. Apr. 18:55
2. Mai 21:55
5. Juni 08:25
5. Juli 20:09
1. Aug. 13:42
[...]

As you can see, the month names are abbreviated if the month has more than 4 characters. Even weirder, don't ask me why, the month of March is shortened to Mär. although the full name is März. How can I parse this with java.time? (The dates are formatted based on the localization of the Android device that creates the list of dates. However, I'm not parsing them on Android.)

My approach was to create a DateTimeFormatter like this:

DateTimeFormatter.ofPattern("d. MMMM HH:mm").withLocale(Locale.GERMAN);
// or
DateTimeFormatter.ofPattern("d. MMMMM HH:mm").withLocale(Locale.GERMAN);

But neither the MMMM nor the MMMMM pattern fits the shortened dates. I can, of course, use the pattern d. MMM. HH:mm to match the shortened months, but then I can't match the 3- and 4-character months. I am aware that I can have two formatters (MMM. and MMMMM), but I would rather have a solution with only one formatter and possibly a custom locale or something like that.

The answer to the problem is the DateTimeFormatterBuilder class and its appendText(TemporalField, Map) method. It allows any text to be associated with a value when formatting or parsing, which solves the problem effectively and elegantly:

Map<Long, String> monthNameMap = new HashMap<>();
monthNameMap.put(1L, "Jan.");
monthNameMap.put(2L, "Feb.");
monthNameMap.put(3L, "Mär.");
// ... and so on for the remaining months
DateTimeFormatter fmt = new DateTimeFormatterBuilder()
    .appendPattern("d. ")
    .appendText(ChronoField.MONTH_OF_YEAR, monthNameMap)
    .appendPattern(" HH:mm")
    .parseDefaulting(ChronoField.YEAR, 2016)
    .toFormatter();

System.out.println(LocalDateTime.parse("10. Jan. 18:14", fmt));
System.out.println(LocalDateTime.parse("8. Feb. 19:02", fmt));

Some notes:

  • The monthNameMap must be populated with all 12 months
  • The formatter should normally be assigned to a static final constant, rather than being created every time it is needed
  • The parseDefaulting(YEAR, 2016) call has been added so that LocalDateTime.parse(String, DateTimeFormatter) can be used directly. Without it, there would be no year, and thus nothing more than a TemporalAccessor could be parsed (the default year must be a leap year, in case 29 Feb is being parsed)
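
As a rough sketch of what the notes above amount to, the following puts a fully populated map and the formatter into static final constants. The class name GermanMonthParser is made up for illustration. The texts for months 1 to 8 are taken from the sample dates in the question; the texts for September to December are assumptions, since the question does not show them, and would need to be checked against real input. The ä is written as a Unicode escape so the file compiles regardless of source encoding.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.HashMap;
import java.util.Map;

public class GermanMonthParser {

    // Month texts exactly as they appear in the input.
    private static final Map<Long, String> MONTH_NAMES = new HashMap<>();
    static {
        MONTH_NAMES.put(1L, "Jan.");
        MONTH_NAMES.put(2L, "Feb.");
        MONTH_NAMES.put(3L, "M\u00e4r.");  // a-umlaut as an escape
        MONTH_NAMES.put(4L, "Apr.");
        MONTH_NAMES.put(5L, "Mai");
        MONTH_NAMES.put(6L, "Juni");
        MONTH_NAMES.put(7L, "Juli");
        MONTH_NAMES.put(8L, "Aug.");
        // ASSUMPTION: months 9-12 do not appear in the question's sample data.
        MONTH_NAMES.put(9L, "Sep.");
        MONTH_NAMES.put(10L, "Okt.");
        MONTH_NAMES.put(11L, "Nov.");
        MONTH_NAMES.put(12L, "Dez.");
    }

    // Built once and reused; DateTimeFormatter is immutable and thread-safe.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("d. ")
            .appendText(ChronoField.MONTH_OF_YEAR, MONTH_NAMES)
            .appendPattern(" HH:mm")
            // The input has no year; 2016 is a leap year, so "29. Feb." resolves.
            .parseDefaulting(ChronoField.YEAR, 2016)
            .toFormatter();

    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(parse("1. M\u00e4r. 19:40")); // 2016-03-01T19:40
        System.out.println(parse("2. Mai 21:55"));       // 2016-05-02T21:55
        System.out.println(parse("5. Juni 08:25"));      // 2016-06-05T08:25
    }
}
```

Matching is case-sensitive and literal here, so a month text that differs from the map entry (for example a missing period) raises a DateTimeParseException.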

You could use a DateTimeFormatterBuilder:

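// Note: "ss" is second-of-minute, so the last two digits are parsed as seconds
// (see SecondOfMinute in the output below); use "HH:mm" to read them as minutes.
private static final DateTimeFormatter formatter = new DateTimeFormatterBuilder()
        .appendOptional(DateTimeFormatter.ofPattern("d. MMM. HH:ss"))
        .appendOptional(DateTimeFormatter.ofPattern("d. MMMM HH:ss"))
        .toFormatter(Locale.GERMAN);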

Running it on this:

Stream.of(("10. Jan. 18:14\n" +
           "8. Feb. 19:02\n" +
           "1. Mär. 19:40\n" +
           "4. Apr. 18:55\n" +
           "2. Mai 21:55\n" +
           "5. Juni 08:25\n" +
           "5. Juli 20:09\n" +
           "1. Aug. 13:42").split("\n"))
       .map(formatter::parse)
       .forEach(System.out::println);

you get:

{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=10, MonthOfYear=1, MilliOfSecond=0, SecondOfMinute=14, HourOfDay=18},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=8, MonthOfYear=2, MilliOfSecond=0, SecondOfMinute=2, HourOfDay=19},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=1, MonthOfYear=3, MilliOfSecond=0, SecondOfMinute=40, HourOfDay=19},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=4, MonthOfYear=4, MilliOfSecond=0, SecondOfMinute=55, HourOfDay=18},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=2, MonthOfYear=5, MilliOfSecond=0, SecondOfMinute=55, HourOfDay=21},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=5, MonthOfYear=6, MilliOfSecond=0, SecondOfMinute=25, HourOfDay=8},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=5, MonthOfYear=7, MilliOfSecond=0, SecondOfMinute=9, HourOfDay=20},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=1, MonthOfYear=8, MilliOfSecond=0, SecondOfMinute=42, HourOfDay=13},ISO

As pointed out, it would be easier to use a standard and consistent format; here you are mixing long and short month names.

One option (short of using a DateTimeFormatterBuilder) is to handle both cases separately:

// "mm" is minute-of-hour; "ss" would parse the last two digits as seconds
private static final DateTimeFormatter SHORT_MONTH = DateTimeFormatter.ofPattern("d. MMM. HH:mm", Locale.GERMAN);
private static final DateTimeFormatter LONG_MONTH = DateTimeFormatter.ofPattern("d. MMMM HH:mm", Locale.GERMAN);
private static TemporalAccessor parse(String s) {
  try {
    return SHORT_MONTH.parse(s);
  } catch (DateTimeParseException e) {
    return LONG_MONTH.parse(s);
  }
}

You can use a regex replacement to cut the month portion to exactly 3 characters before parsing it with "d. MMM HH:mm":

text = text.replaceFirst("^(\\S+\\s\\S{3})\\S", "$1");

Explanation of the regex: starting at the beginning of the string (^), find 1 or more non-whitespace characters (\\S+) followed by 1 whitespace (\\s) followed by three non-whitespace characters (\\S{3}) followed by one more non-whitespace character, and replace the match with the part inside the first group ($1). The ^ anchor matters for three-letter months: without it, "2. Mai 21:55" would match starting at "Mai" and lose the last digit of the time.

10. Jan. 18:14 will become 10. Jan 18:14 and 5. Juni 08:25 will become 5. Jun 08:25.
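
A small sketch of just the normalisation step, so its behaviour on the different month lengths can be checked in isolation. The class and method names are made up for illustration, and the pattern is anchored with ^ so that inputs with a three-letter month are returned unchanged. Whether the normalised text then parses with "d. MMM HH:mm" depends on the short month names in the JDK's German locale data, which this sketch does not test.

```java
public class MonthNormalizer {

    // Keeps the day token and the first three characters of the month token,
    // and drops the single non-whitespace character that follows them, if any.
    static String normalize(String text) {
        return text.replaceFirst("^(\\S+\\s\\S{3})\\S", "$1");
    }

    public static void main(String[] args) {
        System.out.println(normalize("10. Jan. 18:14")); // 10. Jan 18:14
        System.out.println(normalize("5. Juni 08:25"));  // 5. Jun 08:25
        System.out.println(normalize("2. Mai 21:55"));   // 2. Mai 21:55 (unchanged)
    }
}
```

Only one trailing character is removed, so a month text longer than four characters would keep part of its tail; the sample dates in the question contain no such case.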
