
dplyr: group_by and which

For each user in a data set, I'd like to compute the sum of "value" over the rows where "flag" is lower than 5.

I could use ifelse instead of which, but I don't understand why this code doesn't work:

df <- data.frame(
  user_id = c(1, 1, 1, 2, 2, 2),
     flag = c(2,5, 3, 1, 2, 7),
    value = c(20, 10, 4, 3, 2, 2) 
)
df

library(dplyr)
 df2 =
   df %>%
   group_by(user_id) %>%
   mutate(variable1 = sum(.$value[which(.$flag<5)]),
          variable2 = sum(.$value[which(.$flag<10)])) %>%
   ungroup()

Error in .$c(20, 10, 4) : invalid subscript type 'double'

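You don't need .$ here. Inside mutate, column names are evaluated within each group, whereas . refers to the whole data frame passed through the pipe, so .$value ignores the grouping.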

 df %>%
    group_by(user_id) %>% 
    mutate(variable1= sum(value[flag<5]), variable2 = sum(value[flag<10]))
#    user_id flag value variable1 variable2
#1       1    2    20        24        34
#2       1    5    10        24        34
#3       1    3     4        24        34
#4       2    1     3         5         7
#5       2    2     2         5         7
#6       2    7     2         5         7
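If one row per user is enough, rather than the sums repeated on every row, the same subsetting can go inside summarise. This is a minimal sketch on the question's data; the name per_user is only illustrative.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  user_id = c(1, 1, 1, 2, 2, 2),
  flag    = c(2, 5, 3, 1, 2, 7),
  value   = c(20, 10, 4, 3, 2, 2)
)

# summarise() collapses each group to a single row,
# so every user_id appears once with its conditional sums
per_user <- df %>%
  group_by(user_id) %>%
  summarise(
    variable1 = sum(value[flag < 5]),
    variable2 = sum(value[flag < 10])
  )

per_user
```

Use mutate when the original rows need to be kept alongside the sums, and summarise when only the per-user totals matter.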

If there are multiple value columns, you can use mutate_each (note that mutate_each and funs have since been deprecated in dplyr in favor of across):

df$value2 <- c(22,12,7,5,2,1)

df %>%
   group_by(user_id) %>% 
   mutate_each(funs(variable1=sum(.[flag<5]), variable2=sum(.[flag<10])),
         starts_with('value')) 
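For newer dplyr versions, the same idea can probably be written with across. This is a sketch assuming dplyr 1.0.0 or later; the output column names (value_variable1 and so on) come from across's default naming and were not part of the original answer.

```r
library(dplyr)

# Question data plus the extra value2 column from the answer
df <- data.frame(
  user_id = c(1, 1, 1, 2, 2, 2),
  flag    = c(2, 5, 3, 1, 2, 7),
  value   = c(20, 10, 4, 3, 2, 2),
  value2  = c(22, 12, 7, 5, 2, 1)
)

res <- df %>%
  group_by(user_id) %>%
  mutate(across(
    starts_with("value"),
    # .x is the current column; flag is looked up within the current group
    list(
      variable1 = ~ sum(.x[flag < 5]),
      variable2 = ~ sum(.x[flag < 10])
    )
  )) %>%
  ungroup()

res
```

Each selected column gets one new column per function, named as the column name followed by the function name.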

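Here is one case where using which changes the result: when flag contains NA values. which returns only the positions that are TRUE, so the NA comparisons are dropped and the sum for that group is 0.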

 df$flag[1:3] <- NA
 df %>% 
    group_by(user_id) %>%
    mutate(variable1 = sum(value[which(flag <5)]))
 #  user_id flag value variable1
 #1       1   NA    20         0
 #2       1   NA    10         0
 #3       1   NA     4         0
 #4       2    1     3         5
 #5       2    2     2         5
 #6       2    7     2         5

Without which, the NA comparisons yield NA elements in the subset, so the sum becomes NA:

 df %>%
     group_by(user_id) %>%
     mutate(variable1 = sum(value[flag <5]))
 #  user_id flag value variable1
 #1       1   NA    20        NA
 #2       1   NA    10        NA
 #3       1   NA     4        NA
 #4       2    1     3         5
 #5       2    2     2         5
 #6       2    7     2         5
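The difference can be reproduced in base R without any grouping. The sketch below uses a small made-up vector (not the data above) and also shows na.rm = TRUE as one possible alternative to which.

```r
# Illustrative vectors: one NA, one match, one non-match
value <- c(20, 3, 2)
flag  <- c(NA, 1, 7)

cond <- flag < 5               # NA TRUE FALSE

# which() keeps only the TRUE positions, so the NA is dropped
with_which <- sum(value[which(cond)])

# Logical indexing with NA yields an NA element, which propagates through sum()
without_which <- sum(value[cond])

# na.rm = TRUE removes that NA element before summing
with_na_rm <- sum(value[cond], na.rm = TRUE)

print(c(with_which, without_which, with_na_rm))
```

Which form is appropriate depends on whether rows with a missing flag should count as "no match" or should make the result unknown.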
