(dplyr) Error when using mutate(), case_when() and which()

I need help with an error and some warnings in the following situation:

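I have a data frame with observation dates and a vector of those dates. I want to create new columns in the data frame holding the next and the previous observation date. The dates vector is created as: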

library(glue)
library(dplyr)  # provides the %>% pipe used below

dates = c("201902",
          "201906",
          "201911",
          "202002")

dates = glue("{dates}01")

dates = dates%>%
        as.Date(format = "%Y%m%d")%>%
        sort()

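My data frame then has a column called Date, made up of elements of dates. I want to create columns with the next and previous date, or keep the same date if it is the first/last one. I am using: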

library(dplyr)

my_df = my_df%>%
  mutate(First_date = (Date == dates[1]),
         Last_date = (Date == dates[length(dates)]),
         Prev_date = case_when(First_date ~ Date,
                               TRUE ~ dates[which(dates == Date)-1]),
         Next_date = case_when(Last_date ~ Date,
                               TRUE ~ dates[which(dates == Date)+1]))

Example: if I have a data frame with the following column:

>my_df$Date
[1] "2019-02-01" "2019-06-01" "2019-11-01" "2020-02-01"

I would like it to end up with:

>my_df$First_date
[1] TRUE FALSE FALSE FALSE
>my_df$Last_date
[1] FALSE FALSE FALSE TRUE
>my_df$Prev_date
[1] "2019-02-01" "2019-02-01" "2019-06-01" "2019-11-01"
>my_df$Next_date
[1] "2019-06-01" "2019-11-01" "2020-02-01" "2020-02-01"

The test data frame I am using has 6 rows, and it throws this error and these warnings:

Error: `TRUE ~ dates[which(dates == Date) + 1]` must be length 6 or one, not 2
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning messages:
1: In `==.default`(dates, Date) :
  longer object length is not a multiple of shorter object length
2: In `==.default`(dates, Date) :
  longer object length is not a multiple of shorter object length

I think this must be related to calling which() inside case_when() inside mutate(), but I have not managed to figure out exactly what is going wrong.

This is my first time asking here, so apologies for any mistakes!
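
On the suspicion about which() raised in the question: inside mutate(), Date refers to the whole column, so dates == Date compares a length-4 vector with a length-6 vector rather than looking up one row at a time. The sketch below reproduces that outside mutate(). The 6-row Date column is made up for illustration (the real test data is not shown in the question), and it was chosen so that the lengths line up with the reported message.

```r
# Hypothetical 6-row Date column; the question does not show the real test data.
dates <- as.Date(c("2019-02-01", "2019-06-01", "2019-11-01", "2020-02-01"))
Date  <- dates[c(1, 1, 3, 3, 2, 4)]

# A 4-vs-6 comparison: R recycles the shorter vector and warns that
# 6 is not a multiple of 4 (the warning is suppressed here to keep the output clean).
cmp <- suppressWarnings(dates == Date)
length(cmp)               # 6: one result per recycled position, not a per-row lookup

# which() keeps only the positions where the recycled vectors happen to agree.
hits <- which(cmp)
hits                      # 1 3

# Indexing dates with those positions gives a vector whose length is neither 6 nor 1,
# which is the kind of length that case_when() rejects.
length(dates[hits + 1])   # 2
```

With this made-up column, dates[hits - 1] has length 1 (index 0 is silently dropped), which case_when() would accept by recycling. That may be why only the `+ 1` branch shows up in the error message, but this is a guess about the original data.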

OK, this should work. First, your df:

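library(glue)
library(dplyr)
library(data.table)  # shift() used below is presumably data.table::shift(); dplyr has no shift()

dates = c("201902",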
          "201906",
          "201911",
          "202002")

dates = glue("{dates}01")

dates = dates%>%
        as.Date(format = "%Y%m%d")%>%
        sort()

my_df <- data.frame(Date = dates)

Then use the shift() function (presumably data.table::shift(), so the data.table package has to be loaded):

my_df <- my_df %>%
  mutate(First_date = ifelse(Date == dates[1], TRUE, FALSE),
         Last_date = ifelse(Date == dates[length(dates)], TRUE, FALSE),
         Prev_date = shift(dates, n = 1, fill = dates[1]),
         Next_date = shift(dates, n = -1, fill = dates[length(dates)]))

> my_df
        Date First_date Last_date  Prev_date  Next_date
1 2019-02-01       TRUE     FALSE 2019-02-01 2019-06-01
2 2019-06-01      FALSE     FALSE 2019-02-01 2019-11-01
3 2019-11-01      FALSE     FALSE 2019-06-01 2020-02-01
4 2020-02-01      FALSE      TRUE 2019-11-01 2020-02-01
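
The shift() call above shifts the dates vector itself, so it relies on my_df having exactly one row per date, in the same order as dates. For a data frame like the 6-row one mentioned in the question, where dates can repeat, one possible alternative is to look up each row's position in dates with match() and clamp the index at both ends. This is a minimal sketch, not the method from the answer; the 6-row data frame, the helper column idx and the variable n_dates are made up for illustration.

```r
library(dplyr)

dates <- as.Date(c("2019-02-01", "2019-06-01", "2019-11-01", "2020-02-01"))

# Hypothetical 6-row data frame with repeated dates.
my_df <- data.frame(Date = dates[c(1, 1, 3, 3, 2, 4)])

n_dates <- length(dates)

my_df <- my_df %>%
  mutate(idx        = match(Date, dates),             # position of each row's date in dates
         First_date = idx == 1,
         Last_date  = idx == n_dates,
         Prev_date  = dates[pmax(idx - 1, 1)],        # clamp at the first date
         Next_date  = dates[pmin(idx + 1, n_dates)])  # clamp at the last date
```

match() is vectorised over Date, so every expression inside mutate() has one value per row and no case_when() is needed. The idx helper can be dropped afterwards with select(-idx).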
