Match ISO 8601 week-of-year numbers to month-of-year numbers on Windows with German locale

This is directly related to my question "POSIX date from dates in weekly time format".

However, in this question I'd like to ask specifically how to map ISO 8601 week numbers to month-of-the-year numbers.

To me, it seems this is not possible and/or involves some non-intuitive hacks (and even these don't work reliably), so IMO it should be considered something that needs to be fixed in base R. Please correct me if I'm wrong, though.

EDIT: it seems the issue is closely related to running on Windows and/or to the locale you're on (standard German, in my case).

posix <- as.POSIXct(c("2015-12-24", "2015-12-31", "2016-01-01", "2016-01-08"))

ISO 8601

(yw <- format(posix, "%Y-%V"))
# [1] "2015-52" "2015-53" "2016-53" "2016-01"
ywd <- sprintf("%s-1", yw)
(as.POSIXct(ywd, format = "%Y-%V-%u"))
# [1] "2015-01-12 CET" "2015-01-12 CET" "2016-01-12 CET" "2016-01-12 CET"
# -> utterly wrong!!!

ywd <- sprintf("%s-4", yw)
(as.POSIXct(ywd, format = "%Y-%V-%u"))
# -> still wrong -> the day of the week is not the reason

# -> no way to use ISO 8601 convention to map week of the year to month of the year

For the sake of due diligence: it's also not possible when trying to use the US or UK conventions:

US convention

(yw <- format(posix, "%Y-%U"))
# [1] "2015-51" "2015-52" "2016-00" "2016-01"
ywd <- sprintf("%s-1", yw)
(as.POSIXct(ywd, format = "%Y-%U-%u"))
# [1] "2015-12-21 CET" "2015-12-28 CET" NA               "2016-01-04 CET"
# -> NA problem for week 00

ywd <- sprintf("%s-4", yw)
# -> does not work for week 00
(as.POSIXct(ywd, format = "%Y-%U-%u"))
# The day of the week is not the reason

# -> no way to use this convention to reliably map week of the year to month of the year

UK convention

(yw <- format(posix, "%Y-%W"))
# [1] "2015-51" "2015-52" "2016-00" "2016-01"
ywd <- sprintf("%s-1", yw)
(as.POSIXct(ywd, format = "%Y-%W-%u"))
# [1] "2015-12-21 CET" "2015-12-28 CET" NA               "2016-01-04 CET"
# -> NA problem for week 00

ywd <- sprintf("%s-4", yw)
# -> does not work for week 00
(as.POSIXct(ywd, format = "%Y-%W-%u"))
# The day of the week is not the reason

# -> no way to use this convention to reliably map week of the year to month of the year

Session info

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

[1] LC_COLLATE=German_Germany.1252     LC_CTYPE=German_Germany.1252       LC_MONETARY=German_Germany.1252   
[4] LC_NUMERIC=C                       LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] fva_0.1.0       digest_0.6.10   readxl_0.1.1    dplyr_0.5.0     plyr_1.8.4      magrittr_1.5   
 [7] memoise_1.0.0   testthat_1.0.2  roxygen2_5.0.1  devtools_1.12.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8     lubridate_1.6.0 assertthat_0.1  packrat_0.4.8-1 crayon_1.3.2    withr_1.0.2    
 [7] R6_2.2.0        DBI_0.5-1       stringi_1.1.2   rstudioapi_0.6  tools_3.3.2     stringr_1.1.0  
[13] tibble_1.2     

> devtools::session_info()
Session info -----------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.3.2 (2016-10-31)
 system   x86_64, mingw32             
 ui       RStudio (1.0.136)           
 language en                          
 collate  German_Germany.1252         
 tz       Europe/Berlin               
 date     2017-01-12                  

Packages ---------------------------------------------------------------------------------------------------
 package    * version date       source        
 assertthat   0.1     2013-12-06 CRAN (R 3.3.2)
 crayon       1.3.2   2016-06-28 CRAN (R 3.3.2)
 DBI          0.5-1   2016-09-10 CRAN (R 3.3.2)
 devtools   * 1.12.0  2016-06-24 CRAN (R 3.3.2)
 digest     * 0.6.10  2016-08-02 CRAN (R 3.3.2)
 dplyr      * 0.5.0   2016-06-24 CRAN (R 3.3.2)
 fva        * 0.1.0   <NA>       local         
 lubridate    1.6.0   2016-09-13 CRAN (R 3.3.2)
 magrittr   * 1.5     2014-11-22 CRAN (R 3.3.2)
 memoise    * 1.0.0   2016-01-29 CRAN (R 3.3.2)
 packrat      0.4.8-1 2016-09-07 CRAN (R 3.3.2)
 plyr       * 1.8.4   2016-06-08 CRAN (R 3.3.2)
 R6           2.2.0   2016-10-05 CRAN (R 3.3.2)
 Rcpp         0.12.8  2016-11-17 CRAN (R 3.3.2)
 readxl     * 0.1.1   2016-03-28 CRAN (R 3.3.2)
 roxygen2   * 5.0.1   2015-11-11 CRAN (R 3.3.2)
 stringi      1.1.2   2016-10-01 CRAN (R 3.3.2)
 stringr      1.1.0   2016-08-19 CRAN (R 3.3.2)
 testthat   * 1.0.2   2016-04-23 CRAN (R 3.3.2)
 tibble       1.2     2016-08-26 CRAN (R 3.3.2)
 withr        1.0.2   2016-06-20 CRAN (R 3.3.2)

Disclosure: As mentioned in this answer, I have created the ISOweek package to deal with ISO 8601 week-based dates.

The question contains two flaws:

  1. The ISO 8601 week-based year is different from the calendar year.
  2. Without specifying a day of the week, the conversion of year-week to year-month is ambiguous.

Week-based year vs calendar year

The OP has created sample data using:

posix <- as.POSIXct(c("2015-12-24", "2015-12-31", "2016-01-01", "2016-01-08"))
(yw <- format(posix, "%Y-%V"))
 [1] "2015-52" "2015-53" "2016-53" "2016-01"

The format specification %Y returns the calendar year, which is the wrong year to pair with the ISO week number of the third element: 2016-01-01 lies in week 53 of the week-based year 2015, not 2016.

With the correct format specification %G (the ISO 8601 week-based year) we get:

(yw <- format(posix, "%G-%V"))
 [1] "2015-52" "2015-53" "2015-53" "2016-01"
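
The difference between %Y and %G follows from the ISO 8601 rule that a week belongs to the year that contains its Thursday. Here is a minimal base-R sketch of that rule, in case it helps to see it spelled out. The helper name `iso_year_of` is made up for this illustration and is not part of base R or ISOweek:

```r
# Hypothetical helper: ISO 8601 week-based year of a date, derived from
# the calendar year of the Thursday in the same Monday-to-Sunday week.
iso_year_of <- function(d) {
  d <- as.Date(d)
  # POSIXlt wday is 0 = Sunday ... 6 = Saturday; shift so that Monday = 0
  days_since_monday <- (as.POSIXlt(d)$wday + 6L) %% 7L
  thursday <- d - days_since_monday + 3L
  as.integer(format(thursday, "%Y"))
}

iso_year_of(c("2015-12-24", "2015-12-31", "2016-01-01", "2016-01-08"))
# [1] 2015 2015 2015 2016
```

2016-01-01 is a Friday, so the Thursday of its week is 2015-12-31, which is why that date is assigned to the week-based year 2015.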

Conversion of week-of-the-year to month-of-the-year

Just providing the ISO week-based year and week number without the day of the week will yield ambiguous results.

This can be demonstrated with the (corrected) sample data, which now contain three consecutive weeks in the OP's own (non-standard) year-week format:

 [1] "2015-52" "2015-53" "2016-01"

With the help of the ISOweek2date() function from the ISOweek package, the data are converted to calendar dates. Note that ISOweek2date() requires a full ISO 8601 week-based date in the format yyyy-Www-d, including the day of the week. If we choose the first day of the week (Monday) we get:

library(ISOweek)
library(magrittr)  # provides %>%

yw %>% 
  # insert "W" to conform with ISO 8601 format
  sub("-", "-W", .) %>% 
  # append day of week
  paste0("-1") %>%
  # convert to class Date and print as yyyy-mm 
  ISOweek2date() %>% 
  format("%Y-%m")
 [1] "2015-12" "2015-12" "2016-01"

Now, we repeat this using the last day of the week (Sunday):

yw %>% 
  sub("-", "-W", .) %>% 
  paste0("-7") %>% 
  ISOweek2date() %>% 
  format("%Y-%m")
 [1] "2015-12" "2016-01" "2016-01"

Note that the second element now refers to January 2016 instead of December 2015, because the Sunday of week 53 is in January while the Monday of the same week is still in December.
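
To make the ambiguity explicit, one can list every calendar month that a given week touches. The sketch below uses base R only and assumes the Mondays of the three weeks are already known (2015-12-21, 2015-12-28 and 2016-01-04). The variable names are illustrative:

```r
# Mondays of the three sample weeks (assumed known for this illustration)
week_labels <- c("2015-W52", "2015-W53", "2016-W01")
mondays <- as.Date(c("2015-12-21", "2015-12-28", "2016-01-04"))

# For each week, collect the distinct year-month values of its seven days
months_covered <- lapply(seq_along(mondays), function(i) {
  days <- mondays[i] + 0:6          # Monday through Sunday
  unique(format(days, "%Y-%m"))
})
names(months_covered) <- week_labels

months_covered[["2015-W53"]]
# [1] "2015-12" "2016-01"
```

A week that returns more than one month cannot be mapped to a single month without an extra convention, such as always taking the Monday or always taking the Thursday.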

The documentation of R's date-time format specifications, ?strptime, says that "%V" is ignored on input.
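
Since "%V" cannot be relied on for parsing, one possible base-R workaround is to compute the date arithmetically. The sketch below relies on the ISO 8601 property that 4 January always falls in week 01. The function name `iso_week_to_date` is hypothetical, and the function does not validate its input (for example, it does not check whether a year actually has a week 53):

```r
# Hypothetical helper: date of a given ISO weekday (1 = Monday ... 7 = Sunday)
# in a given ISO 8601 week of a given week-based year.
iso_week_to_date <- function(iso_year, iso_week, weekday = 1L) {
  jan4 <- as.Date(sprintf("%d-01-04", as.integer(iso_year)))
  # POSIXlt wday is 0 = Sunday ... 6 = Saturday; shift so that Monday = 0
  days_since_monday <- (as.POSIXlt(jan4)$wday + 6L) %% 7L
  week1_monday <- jan4 - days_since_monday
  week1_monday + (iso_week - 1L) * 7L + (weekday - 1L)
}

iso_week_to_date(c(2015, 2015, 2016), c(52, 53, 1))
# [1] "2015-12-21" "2015-12-28" "2016-01-04"

format(iso_week_to_date(2015, 53, weekday = 7), "%Y-%m")
# [1] "2016-01"
```

Because it avoids strptime() entirely, this approach should not depend on the platform or the locale.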

Pretty sure something else besides base R needs changing (see the note at the end, though):

some_dates <- as.POSIXct(c("2015-12-24", "2015-12-31", "2016-01-01", "2016-01-08"))

(year_week <- format(some_dates, "%Y %U"))
## [1] "2015 51" "2015 52" "2016 00" "2016 01"

(year_week_day <- sprintf("%s 1", year_week))
## [1] "2015 51 1" "2015 52 1" "2016 00 1" "2016 01 1"

(as.POSIXct(year_week_day, format = "%Y %U %u"))
## [1] "2015-12-21 EST" "2015-12-28 EST" "2016-01-04 EST" "2016-01-04 EST"

It works with the dashes, too:

(year_week <- format(some_dates, "%Y-%U"))
## [1] "2015-51" "2015-52" "2016-00" "2016-01"

(year_week_day <- sprintf("%s-1", year_week))
## [1] "2015-51-1" "2015-52-1" "2016-00-1" "2016-01-1"

(as.POSIXct(year_week_day, format = "%Y-%U-%u"))
## [1] "2015-12-21 EST" "2015-12-28 EST" "2016-01-04 EST" "2016-01-04 EST"

And, although dashes are valid ISO form, they can confuse readers when the various values aren't >12 or <0.


As the comment thread indicates, this is the behaviour on Windows:

(year_week <- format(some_dates, "%Y-%U"))
## [1] "2015-51" "2015-52" "2016-00" "2016-01"

(year_week_day <- sprintf("%s-1", year_week))
## [1] "2015-51-1" "2015-52-1" "2016-00-1" "2016-01-1"

(as.POSIXct(year_week_day, format = "%Y-%U-%u"))
## [1] "2015-12-21 PST" "2015-12-28 PST" NA               "2016-01-04 PST"

(Windows 10 64-bit, R 3.3.2 for me/this example)

