
Java Week to Date conversion for US calendar (non-ISO8601)

I need to convert strings of pattern 'YYYY0ww' to dates, based on the US calendar (minimal days in first week = 1, first day of the week = Sunday). Examples: 2020051, 2020052,...

Currently I'm using the following approach for building appropriate formatters (using java.time packages):

DateTimeFormatter weekFormat = new DateTimeFormatterBuilder()
                    .parseDefaulting(ChronoField.DAY_OF_WEEK, WeekFields.SUNDAY_START.getFirstDayOfWeek().getValue())
                    .appendValue(WeekFields.SUNDAY_START.weekBasedYear(), 4)
                    .appendValue(WeekFields.SUNDAY_START.weekOfWeekBasedYear(), 3).parseStrict().toFormatter();

DateTimeFormatter dateFormat = DateTimeFormatter.ofPattern("yyyyMMdd").withLocale(Locale.US);

The resolved date for a given week must be based on the Sunday of the given week.

It seems there's an obscure bug because one of the following tests fails:

TestCase.assertEquals("20201213", LocalDate.parse("2020051", weekFormat).format(dateFormat)); // success
TestCase.assertEquals("20201220", LocalDate.parse("2020052", weekFormat).format(dateFormat)); // success
TestCase.assertEquals("20201227", LocalDate.parse("2020053", weekFormat).format(dateFormat)); // fail - parsed date contains year=2020, month=12, day=20 which would be CW 52/2020.
TestCase.assertEquals("20210103", LocalDate.parse("2021001", weekFormat).format(dateFormat)); // success

According to https://www.calendar-365.com/2020-calendar.html there exists a week 53 in 2020 in the US calendar - so "2020053" should be 20201227.

I already tried applying .withResolverStyle(ResolverStyle.LENIENT) to the week formatter (because I noticed in the JDK implementation that strict parsing sometimes clamps the week for some reason, see WeekFields.ofWeekBasedYear()). This makes the CW 53/2020 case work, but then the test for "2021001" fails in a similar manner:

TestCase.assertEquals("20201213", LocalDate.parse("2020051", weekFormat).format(dateFormat)); // success
TestCase.assertEquals("20201220", LocalDate.parse("2020052", weekFormat).format(dateFormat)); // success
TestCase.assertEquals("20201227", LocalDate.parse("2020053", weekFormat).format(dateFormat)); // success
TestCase.assertEquals("20210103", LocalDate.parse("2021001", weekFormat).format(dateFormat)); // fail - parsed date contains year=2020, month=12, day=27 which would be CW 53/2020.

I'm confused that it's so tricky to get week to date conversion correct for non-ISO8601 calendars - I think this use-case is not very exotic.

I would appreciate feedback on why my approach does not deliver correct results.

Update: after thinking more about this, I think I may have fallen into the trap of incorrectly believing that there is a calendar week 53 in US calendars in certain years. At least the linked website mentions CW 53. But the rule that the first week of the year is the week (starting on Sunday) that contains January 1st probably means that there is never a calendar week 53 (and some websites got it wrong).

Posting this answer myself as Ole VV provided the essential analysis but declined to create the answer (explanation in his profile) - so kudos to Ole VV

First of all, it's essential to be clear about which years have a calendar week 53 in the US calendar. The rules for US calendars have not been specified as a standard (unlike ISO 8601), but the most common rule seems to be that January 1st defines the first calendar week (with weeks starting on Sunday and ending on Saturday). Note that many publicly available US calendars found via a well-known search engine contain incorrect week numbers because the first calendar week was determined incorrectly.

I wrote a small (unpolished) program to calculate the weeks based on this:

import java.util.HashMap;
import java.util.Map;

import org.joda.time.LocalDate;
import org.joda.time.format.DateTimeFormat;
import org.joda.time.format.DateTimeFormatter;

import junit.framework.TestCase;

public class GenerateCWs_TestCase extends TestCase {

    public void test_generate() {

        int startYear = 2015;
        int EndYear = 2028;

        Map<Integer, LocalDate> cw1StartDates = new HashMap<>();

        DateTimeFormatter formatter = DateTimeFormat.forPattern("yyyyMMdd");

        for (int year = startYear; year <= EndYear + 1; year++) {

            LocalDate firstDayInYear = formatter.parseLocalDate(year + "0101");

            System.out.println("Jan 1st. " + year + " -> day of week: " + firstDayInYear.getDayOfWeek());

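            // Joda-Time numbers days 1 (Monday) to 7 (Sunday), so subtracting the
            // day-of-week value steps back to the Sunday on or before January 1st.
            LocalDate sundayOfFirstWeek = firstDayInYear.getDayOfWeek() != 7
                    ? firstDayInYear.minusDays(firstDayInYear.getDayOfWeek())
                    : firstDayInYear;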

            System.out.println("CW1 starts on: " + formatter.print(sundayOfFirstWeek));

            cw1StartDates.put(year, sundayOfFirstWeek);

        }

        for (int year = startYear; year <= EndYear; year++) {

            LocalDate sundayDateOfWeek = cw1StartDates.get(year);

            int weekIndex = 1;

            System.out.println(String.format("CW %02d/%d - Sunday of Week: %s", weekIndex, year,
                    formatter.print(sundayDateOfWeek)));

            while (sundayDateOfWeek.plusWeeks(1).isBefore(cw1StartDates.get(year + 1))) {

                sundayDateOfWeek = sundayDateOfWeek.plusWeeks(1);
                weekIndex++;

                if (weekIndex <= 2 || weekIndex >= 51)
                    System.out.println(String.format("CW %02d/%d - Sunday of Week: %s", weekIndex, year,
                            formatter.print(sundayDateOfWeek)));

            }
            
            System.out.println();

        }

    }

}

which outputs:

CW 01/2015 - Sunday of Week: 20141228
CW 02/2015 - Sunday of Week: 20150104
CW 51/2015 - Sunday of Week: 20151213
CW 52/2015 - Sunday of Week: 20151220
CW 01/2016 - Sunday of Week: 20151227
CW 02/2016 - Sunday of Week: 20160103
CW 51/2016 - Sunday of Week: 20161211
CW 52/2016 - Sunday of Week: 20161218
CW 53/2016 - Sunday of Week: 20161225
CW 01/2017 - Sunday of Week: 20170101
CW 02/2017 - Sunday of Week: 20170108
CW 51/2017 - Sunday of Week: 20171217
CW 52/2017 - Sunday of Week: 20171224
CW 01/2018 - Sunday of Week: 20171231
CW 02/2018 - Sunday of Week: 20180107
CW 51/2018 - Sunday of Week: 20181216
CW 52/2018 - Sunday of Week: 20181223
CW 01/2019 - Sunday of Week: 20181230
CW 02/2019 - Sunday of Week: 20190106
CW 51/2019 - Sunday of Week: 20191215
CW 52/2019 - Sunday of Week: 20191222
CW 01/2020 - Sunday of Week: 20191229
CW 02/2020 - Sunday of Week: 20200105
CW 51/2020 - Sunday of Week: 20201213
CW 52/2020 - Sunday of Week: 20201220
CW 01/2021 - Sunday of Week: 20201227
CW 02/2021 - Sunday of Week: 20210103
CW 51/2021 - Sunday of Week: 20211212
CW 52/2021 - Sunday of Week: 20211219
CW 01/2022 - Sunday of Week: 20211226
CW 02/2022 - Sunday of Week: 20220102
CW 51/2022 - Sunday of Week: 20221211
CW 52/2022 - Sunday of Week: 20221218
CW 53/2022 - Sunday of Week: 20221225
CW 01/2023 - Sunday of Week: 20230101
CW 02/2023 - Sunday of Week: 20230108
CW 51/2023 - Sunday of Week: 20231217
CW 52/2023 - Sunday of Week: 20231224
CW 01/2024 - Sunday of Week: 20231231
CW 02/2024 - Sunday of Week: 20240107
CW 51/2024 - Sunday of Week: 20241215
CW 52/2024 - Sunday of Week: 20241222
CW 01/2025 - Sunday of Week: 20241229
CW 02/2025 - Sunday of Week: 20250105
CW 51/2025 - Sunday of Week: 20251214
CW 52/2025 - Sunday of Week: 20251221
CW 01/2026 - Sunday of Week: 20251228
CW 02/2026 - Sunday of Week: 20260104
CW 51/2026 - Sunday of Week: 20261213
CW 52/2026 - Sunday of Week: 20261220
CW 01/2027 - Sunday of Week: 20261227
CW 02/2027 - Sunday of Week: 20270103
CW 51/2027 - Sunday of Week: 20271212
CW 52/2027 - Sunday of Week: 20271219
CW 01/2028 - Sunday of Week: 20271226
CW 02/2028 - Sunday of Week: 20280102
CW 51/2028 - Sunday of Week: 20281210
CW 52/2028 - Sunday of Week: 20281217
CW 53/2028 - Sunday of Week: 20281224

So years 2016, 2022 and 2028 have a calendar week 53 in the US calendar. Year 2020 does not have a calendar week 53. The following section is based on trying to calculate the Sunday of calendar week 53 of year 2020.
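As a cross-check of the table above, the same rule (CW 1 is the Sunday-to-Saturday week containing January 1st) can be sketched with java.time alone. This is only a minimal sketch under that assumed rule; the class name UsWeekCount and the methods firstSundayOf and weeksIn are hypothetical names introduced for illustration.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper: counts calendar weeks per year under the assumed rule
// "CW 1 is the Sunday-to-Saturday week that contains January 1st".
class UsWeekCount {

    /** The Sunday that starts CW 1: the Sunday on or before January 1st. */
    static LocalDate firstSundayOf(int year) {
        return LocalDate.of(year, 1, 1)
                .with(TemporalAdjusters.previousOrSame(DayOfWeek.SUNDAY));
    }

    /** Number of calendar weeks between the start of CW 1 of this year and of the next year. */
    static int weeksIn(int year) {
        return (int) ChronoUnit.WEEKS.between(firstSundayOf(year), firstSundayOf(year + 1));
    }

    public static void main(String[] args) {
        for (int year = 2015; year <= 2028; year++) {
            System.out.printf("%d: CW 01 starts %s, %d weeks%n",
                    year, firstSundayOf(year), weeksIn(year));
        }
    }
}
```

Under this rule a year has 53 weeks only when the span between two consecutive CW 1 Sundays is 371 days, which is why week 53 is comparatively rare.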

From here, the in-depth analysis by Ole VV in the comments applies. It describes a very unexpected behaviour of DateTimeFormatter in the java.time package: parsing year+week input for CW 53 of year 2020 with Sunday as the first day of the week yields 2020-12-20, which lies in calendar week 52, without any indication that something exceptional happened. Programs unaware of this silently continue processing with a date from week 52 although the input was calendar week 53.

The use of ResolverStyle.STRICT does not make any difference in this case.

The only protection possible is to calculate a calendar date for a given year+week and then add a check that reverses the calculation. If the input week and the output week do not match, the described misbehaviour occurred and special handling in client code is required.
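One possible way to sketch this reverse check is to format the parsed date back with the same week formatter and compare the result with the input. The formatter below is the one from the question; the class name WeekParseGuard, the method parseWeek and the choice to signal a mismatch with a DateTimeException are assumptions made for illustration, not part of the original analysis.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.time.temporal.WeekFields;

// Hypothetical guard around the week formatter from the question.
class WeekParseGuard {

    private static final WeekFields US = WeekFields.SUNDAY_START;

    // Same formatter as in the question: 4-digit week-based year, 3-digit week,
    // day of week defaulted to Sunday.
    private static final DateTimeFormatter WEEK_FORMAT = new DateTimeFormatterBuilder()
            .parseDefaulting(ChronoField.DAY_OF_WEEK, US.getFirstDayOfWeek().getValue())
            .appendValue(US.weekBasedYear(), 4)
            .appendValue(US.weekOfWeekBasedYear(), 3)
            .toFormatter();

    /**
     * Parses text such as "2020052" to the Sunday of that week and rejects
     * input whose year+week does not survive a round trip through the formatter.
     */
    static LocalDate parseWeek(String text) {
        LocalDate date = LocalDate.parse(text, WEEK_FORMAT);
        String roundTrip = date.format(WEEK_FORMAT);
        if (!roundTrip.equals(text)) {
            // The parser resolved the input to a different week than requested.
            throw new DateTimeException(
                    "Week " + text + " does not exist; parser resolved it to " + roundTrip);
        }
        return date;
    }

    public static void main(String[] args) {
        System.out.println(parseWeek("2020052"));
        try {
            System.out.println(parseWeek("2020053"));
        } catch (DateTimeException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Comparing the formatted string rather than individual fields keeps the check independent of how a particular JDK version resolves the invalid week: whether the parser clamps, rolls over into the next year or throws by itself, the caller sees a DateTimeException.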

Ole was kind enough to report a bug in the Oracle Java Bug Database: JDK-8293146: Strict DateTimeFormatter fails to report an invalid week 53
