
How to rank a variable in a column based on a conditional, when there are NAs in the column

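I have a longitudinal data set of two people, where each row of data is numbered by 'episode' and some episodes have a test 'result'. The goals of the code below are to:

  1. Create the binary variable 'sup' to evaluate 'result'. If result is NA, then sup is NA. This code works.
  2. Create sup_rank to enumerate the occurrences of sup==1 within each person who has any. In other words, I want to know whether a given sup==1 is the first, the second, and so on. PROBLEM: this code currently does not work, because person 2's first sup==1 is ranked '2' (when it should be ranked '1').
  3. Create an event variable that is:
  • equal to 1 if sup_rank==1
  • equal to 0 if sup == 0 or sup_rank does not equal 1
  • equal to NA if sup (and therefore sup_rank) is NA

Currently, I try to accomplish #3 in two steps, via event and event_final. PROBLEM: it does not work because 'sup_rank' does not work. Regardless, it would be ideal to create 'event' as a single variable (without needing 'event_final').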

#Load packages
pacman::p_load(dplyr)

#Create variables for data set 
person <- c(1, 1, 2, 2, 2, 2, 2, 2, 2, 2)
episode <- c(1, 2, 1, 2, 3, 4, 5, 6, 7, 8)
result <- c(NA, NA, NA, 1, NA, 2, NA, 2, NA, 2)

#Populate data frame with variables
d <- cbind(person, episode, result)
d <- as.data.frame(d)

#Manipulate data frame to create 4 new variables
d1 <- d %>%
  #Need to create new variables within each person
  group_by(person) %>%
  #Need to correctly order the rows of data before creating the variables
  arrange(person, episode) %>%
  #Create variable to evaluate 'result'
  mutate(sup = if_else(result == 2, 1, 0, NA_real_)) %>%
  #if sup == 1, rank it
  mutate(sup_rank = ifelse(sup == 1, rank(sup == 1, na.last = 'keep', ties.method = 'first'), NA_real_)) %>%
  #create an event if the rank of the sup == 1 is equal to 1 (we want the initial suppression)
  mutate(event = if_else(sup_rank == 1, 1, 0, NA_real_)) %>%
  #now override the value of event to be equal to 0 if sup==0
  mutate(event_final = if_else(sup == 0, 0, event)) %>%
  arrange(person, episode)

print(d1)
#> # A tibble: 10 x 7
#> # Groups:   person [2]
#>    person episode result   sup sup_rank event event_final
#>     <dbl>   <dbl>  <dbl> <dbl>    <dbl> <dbl>       <dbl>
#>  1      1       1     NA    NA       NA    NA          NA
#>  2      1       2     NA    NA       NA    NA          NA
#>  3      2       1     NA    NA       NA    NA          NA
#>  4      2       2      1     0       NA    NA           0
#>  5      2       3     NA    NA       NA    NA          NA
#>  6      2       4      2     1        2     0           0
#>  7      2       5     NA    NA       NA    NA          NA
#>  8      2       6      2     1        3     0           0
#>  9      2       7     NA    NA       NA    NA          NA
#> 10      2       8      2     1        4     0           0

Created on 2022-04-20 by the reprex package (v2.0.0)
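A likely explanation for the unexpected ranks in the output above: `sup == 1` is a logical vector, and `rank()` sorts `FALSE` before `TRUE`. The single `FALSE` (the episode where sup is 0) therefore takes rank 1, and the first `TRUE` gets rank 2. The following is a minimal base-R sketch using person 2's `sup` values copied from the output; the names `rank_logical`, `is_sup`, and `rank_running` are introduced here for illustration only.

```r
# sup values for person 2, copied from the printed output above
sup <- c(NA, 0, NA, 1, NA, 1, NA, 1)

# Ranking the logical comparison: FALSE sorts before TRUE, so the one
# FALSE takes rank 1 and the TRUE values start at rank 2
rank_logical <- rank(sup == 1, na.last = "keep", ties.method = "first")

# Alternative: count only the positions where sup is 1, using a running sum
is_sup <- !is.na(sup) & sup == 1
rank_running <- ifelse(is_sup, cumsum(is_sup), NA)

print(rank_logical)
print(rank_running)
```

With these inputs, `rank_logical` reproduces the 2, 3, 4 seen in the sup_rank column, while `rank_running` numbers the three sup==1 episodes 1, 2, 3.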

There is surely a more efficient way to do this, but in the meantime, here is a solution I created:

#Load packages
pacman::p_load(dplyr)

#Create variables for data set 
person <- c(1, 1, 2, 2, 2, 2, 2, 2, 2, 2)
episode <- c(1, 2, 1, 2, 3, 4, 5, 6, 7, 8)
result <- c(NA, NA, NA, 1, NA, 2, NA, 2, NA, 2)

#Populate data frame with variables
d <- cbind(person, episode, result)
d <- as.data.frame(d)

#Manipulate data frame to create 5 new variables
d1 <- d %>%
  #Need to create new variables within each person
  group_by(person) %>%
  #Need to correctly order the rows of data before creating the variables
  arrange(person, episode) %>%
  #Create variable to evaluate 'result'
  mutate(sup = if_else(result == 2, 1, 0, NA_real_)) %>%
  #Create a flag for each time sup==1
  mutate(sup_flag = if_else(sup == 1, 1, NA_real_, NA_real_)) %>%
  #if sup == 1, rank it
  mutate(sup_rank = ifelse(sup == 1, rank(sup_flag, na.last = 'keep', ties.method = 'first'), NA_real_)) %>%
  #create an event if the rank of the sup == 1 is equal to 1 (we want the initial suppression)
  mutate(event = if_else(sup_rank == 1, 1, 0, NA_real_)) %>%
  #now override the value of event to be equal to 0 if sup==0
  mutate(event_final = if_else(sup == 0, 0, event)) %>%
  arrange(person, episode)

print(d1)
#> # A tibble: 10 x 8
#> # Groups:   person [2]
#>    person episode result   sup sup_flag sup_rank event event_final
#>     <dbl>   <dbl>  <dbl> <dbl>    <dbl>    <dbl> <dbl>       <dbl>
#>  1      1       1     NA    NA       NA       NA    NA          NA
#>  2      1       2     NA    NA       NA       NA    NA          NA
#>  3      2       1     NA    NA       NA       NA    NA          NA
#>  4      2       2      1     0       NA       NA    NA           0
#>  5      2       3     NA    NA       NA       NA    NA          NA
#>  6      2       4      2     1        1        1     1           1
#>  7      2       5     NA    NA       NA       NA    NA          NA
#>  8      2       6      2     1        1        2     0           0
#>  9      2       7     NA    NA       NA       NA    NA          NA
#> 10      2       8      2     1        1        3     0           0

Created on 2022-04-22 by the reprex package (v2.0.0)
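On the wish to build 'event' as a single variable: the sketch below is one possible approach, not the only one. It assumes the same data and column names as above and replaces `rank()` with a running count of sup==1 within each person. The helper column `sup_count` and the data frame name `d2` are introduced here for illustration and are not part of the original code.

```r
library(dplyr)

# Same example data as above
d <- data.frame(
  person  = c(1, 1, 2, 2, 2, 2, 2, 2, 2, 2),
  episode = c(1, 2, 1, 2, 3, 4, 5, 6, 7, 8),
  result  = c(NA, NA, NA, 1, NA, 2, NA, 2, NA, 2)
)

d2 <- d %>%
  arrange(person, episode) %>%
  group_by(person) %>%
  mutate(
    # 1 if result is 2, 0 otherwise, NA if result is NA
    sup = if_else(result == 2, 1, 0, NA_real_),
    # Running count of sup == 1 within person (NA treated as 0 for counting)
    sup_count = cumsum(coalesce(sup, 0)),
    # Occurrence number, kept only on rows where sup == 1
    sup_rank = if_else(sup == 1, sup_count, NA_real_),
    # event in one step: NA if sup is NA, 1 on the first sup == 1, else 0
    event = case_when(
      is.na(sup) ~ NA_real_,
      sup == 1 & sup_count == 1 ~ 1,
      TRUE ~ 0
    )
  ) %>%
  ungroup()

print(d2)
```

For this example data, `event` matches the `event_final` column printed above, and `sup_rank` matches the `sup_rank` column, without the intermediate `sup_flag` and `event_final` variables.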
