Creating a custom new column with mutate, based on two other columns, using dplyr in R

Question

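My goal is to create a new df column whose values are based on two other columns. My dataset concerns recruitment into a study. I would like a column that defines whether or not a person was in a particular round of the study and, if so, whether it was their first involvement, second, third, and so on (there are up to 8 rounds). Currently I am attempting this with mutate(case_when()) and lag(). However, it gives wrong results when a person misses a round of the study and comes back later. The dataset looks like this: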

    person |  round  |  in_round  |
       A        1           1
       A        2           1
       A        3           1
       A        4           1
       A        5           1
       A        6           0
       A        7           0
       A        8           0
       B        1           0
       B        2           0
       B        3           1
       B        4           1
       B        5           1
       B        6           1
       B        7           0
       B        8           1
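
For anyone who wants to reproduce the problem, the table above can be rebuilt as a data frame. This is only a sketch: the name df is borrowed from the answer's code, and the column types (character for person, numeric for the other two) are assumptions, since the question does not state them.

```r
# Rebuild the sample data shown in the table above.
# Column types are assumed; the question does not specify them.
df <- data.frame(
  person   = rep(c("A", "B"), each = 8),
  round    = rep(1:8, times = 2),
  in_round = c(1, 1, 1, 1, 1, 0, 0, 0,   # person A, rounds 1 to 8
               0, 0, 1, 1, 1, 1, 0, 1)   # person B, rounds 1 to 8
)

print(df)
```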

What I need is a separate column that, for each person, uses round and in_round to produce the following:

    person |  round  |  in_round  |  round_status
       A        1           1         recruited
       A        2           1        follow_up_1
       A        3           1        follow_up_2
       A        4           1        follow_up_3
       A        5           1        follow_up_4
       A        6           0           none
       A        7           0           none
       A        8           0           none
       B        1           0           none
       B        2           0           none
       B        3           1         recruited
       B        4           1        follow_up_1
       B        5           1        follow_up_2
       B        6           1        follow_up_3
       B        7           0            none
       B        8           1        follow_up_4

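In summary:

Where in_round == 0, round_status == "none"
The first time in_round == 1, round_status == "recruited"
On subsequent occasions where in_round == 1, round_status == "follow_up_X" (where X is the number of previous rounds the individual took part in).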

Answer 1

Try this:

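library(dplyr)

df %>%
  arrange(person, round) %>%        # rounds must be in order within each person
  group_by(person) %>%
  mutate(
    cum_round = cumsum(in_round),   # running count of rounds attended so far
    round_status = case_when(
      in_round == 0  ~ "none",
      cum_round == 1 ~ "recruited",
      TRUE ~ paste0("follow_up_", cum_round - 1)
    )
  ) %>%
  ungroup()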
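
To see why a cumulative sum copes with a missed round where lag() does not, it may help to trace person B by hand. The sketch below uses base R only, with nested ifelse() standing in for case_when() so that it runs without dplyr. The variable names in_round_b, cum_round_b and status_b are made up for illustration.

```r
# in_round values for person B, rounds 1 to 8 (taken from the question's table)
in_round_b <- c(0, 0, 1, 1, 1, 1, 0, 1)

# Running count of participations. A missed round (round 7) leaves the
# count unchanged, so round 8 continues from 4 instead of starting over.
cum_round_b <- cumsum(in_round_b)
# 0 0 1 2 3 4 4 5

# Same three rules as the answer's case_when(), written with base R ifelse()
status_b <- ifelse(
  in_round_b == 0, "none",
  ifelse(cum_round_b == 1, "recruited",
         paste0("follow_up_", cum_round_b - 1))
)

print(data.frame(round = 1:8, in_round = in_round_b,
                 cum_round = cum_round_b, round_status = status_b))
```

Round 8 comes out as follow_up_4, matching the desired output in the question, because the count depends only on how many rounds the person has attended so far and not on what happened in the immediately preceding row.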

Answer 1 (2, accepted, 2020-03-03 17:19:27)
