简体   繁体   中英

group by in R dplyr for more than one variable on unique value of other variable

I have a dataset with three columns as below:

data <- data.frame(
   grpA = c(1,1,1,1,1,2,2,2),
   idB = c(1,1,2,2,3,4,5,6),
   valueC = c(10,10,20,20,10,30,40,50),
   otherD = c(1,2,3,4,5,6,7,8)
)

valueC is unique to each unique value of idB . I want to use dplyr pipe (as the rest of my code is in dplyr ) and use group_by on grpA to get a new column with sum of valueC values for each group. The answer should be like: newCol <- c(40,40,40,40,40,120,120,120)

but with data %>% group_by(grpA) %>% mutate(newCol=sum(valueC) , I get newCol <- c(70,70,70,70,70,120,120,120)

How do I include unique value of idB ? Is there anything else I can use instead of group_by in dplyr %>% pipe. I cant use summarise as I need to keep values in otherD intact for later use. Other option I have is to create newCol separately through sql and then merge with left join . But I am looking for a better solution inline. If it has been answered before, please refer me to the link as I could not find any relevant answer to this issue.

We need unique with match

data %>% 
   group_by(grpA) %>% 
   mutate(ind = sum(valueC[match(unique(idB), idB)]))
# A tibble: 8 x 5
# Groups:   grpA [2]
#   grpA   idB valueC otherD   ind
#  <dbl> <dbl>  <dbl>  <dbl> <dbl>
#1     1     1     10      1    40
#2     1     1     10      2    40
#3     1     2     20      3    40
#4     1     2     20      4    40
#5     1     3     10      5    40
#6     2     4     30      6   120
#7     2     5     40      7   120
#8     2     6     50      8   120

Or another option is to get the distinct rows by 'grpA', 'idB', grouped by 'grpA', get the sum of 'valueC' and left_join with the original data

data %>% 
     distinct(grpA, idB, .keep_all = TRUE) %>%
     group_by(grpA) %>%
     summarise(newCol = sum(valueC)) %>%
     left_join(data, ., by = 'grpA')

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM