
How to replace specific rows with their column sums in R?

I feel like there should be a simpler way of doing this. Here is my sample data.

library(dplyr)

df <- 
  tibble(
    group1 = c(1,1,2,2,3,3,3,3), 
    group2 = c("A", "B", "A", "B", "A", "B", "A", "B"), 
    vals = c(13,56,15,50,5,22,9,59)
  )
df
# A tibble: 8 x 3
  group1 group2  vals
   <dbl> <chr>  <dbl>
1      1 A         13
2      1 B         56
3      2 A         15
4      2 B         50
5      3 A          5
6      3 B         22
7      3 A          9
8      3 B         59

I want to sum the vals where group1 is 3 (within each group2) and replace the original rows with the summed ones. Can anyone come up with a cleaner/tidier solution than this?

df %>% 
  group_by(group1, group2) %>% 
  bind_rows(
    summarize(
      .[.$group1 == 3,], 
      across(vals, sum), 
      summed = "x"
    )
  ) %>% 
  ungroup() %>% 
  filter(!(group1 == 3 & is.na(summed))) %>% 
  select(-summed)

Here is what the result should be:

# A tibble: 6 x 3
  group1 group2  vals
   <dbl> <chr>  <dbl>
1      1 A         13
2      1 B         56
3      2 A         15
4      2 B         50
5      3 A         14
6      3 B         81

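This isn't very efficient, but it gives you the intended values. Note that the group1 == 3 rows come first in the output, because tmp is 0 for them and the result is sorted by the grouping columns; append arrange(group1, group2) if the row order matters.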

df %>%
  mutate(tmp = if_else(group1 == 3, 0L, row_number())) %>%
  group_by(tmp, group1, group2) %>%
  summarize(vals = sum(vals)) %>%
  ungroup() %>%
  select(-tmp)
# # A tibble: 6 x 3
#   group1 group2  vals
#    <dbl> <chr>  <dbl>
# 1      3 A         14
# 2      3 B         81
# 3      1 A         13
# 4      1 B         56
# 5      2 A         15
# 6      2 B         50
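If the original row order matters, the same idea can be sketched with an explicit sort at the end. This is only a variant of the code above, assuming dplyr 1.0.0 or later (for the .groups argument); the name result is just for illustration.

```r
library(dplyr)

df <- tibble(
  group1 = c(1, 1, 2, 2, 3, 3, 3, 3),
  group2 = c("A", "B", "A", "B", "A", "B", "A", "B"),
  vals   = c(13, 56, 15, 50, 5, 22, 9, 59)
)

result <- df %>%
  # rows with group1 == 3 share the key 0; every other row keeps a unique key
  mutate(tmp = if_else(group1 == 3, 0L, row_number())) %>%
  group_by(tmp, group1, group2) %>%
  # .groups = "drop" replaces the separate ungroup() step
  summarize(vals = sum(vals), .groups = "drop") %>%
  select(-tmp) %>%
  # restore the group1/group2 ordering shown in the question
  arrange(group1, group2)

result
```

Because every non-3 row gets its own tmp value, those rows pass through summarize() unchanged; only the group1 == 3 rows are actually collapsed.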

Another technique would be to split your data into "3" and "not 3", process the "3" data, then recombine them.

df3 <- filter(df, group1 == 3)
dfnot3 <- filter(df, group1 != 3)

df3 %>%
  group_by(group1, group2) %>%
  summarize(vals = sum(vals)) %>%
  ungroup() %>%
  bind_rows(dfnot3)
# # A tibble: 6 x 3
#   group1 group2  vals
#    <dbl> <chr>  <dbl>
# 1      3 A         14
# 2      3 B         81
# 3      1 A         13
# 4      1 B         56
# 5      2 A         15
# 6      2 B         50

(This second one is really only meaningful/efficient if you have lots of non-3 rows.)
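The split-and-recombine approach could also be wrapped in a small function so that the group to collapse is a parameter. The sketch below is not from the original answer: sum_rows_where and its target argument are hypothetical names, and it assumes the columns are called group1, group2 and vals as in the sample data.

```r
library(dplyr)

df <- tibble(
  group1 = c(1, 1, 2, 2, 3, 3, 3, 3),
  group2 = c("A", "B", "A", "B", "A", "B", "A", "B"),
  vals   = c(13, 56, 15, 50, 5, 22, 9, 59)
)

# Hypothetical helper: collapse only the rows where group1 == target,
# summing vals within each group2, and leave all other rows untouched.
sum_rows_where <- function(data, target) {
  summed <- data %>%
    filter(group1 == .env$target) %>%
    group_by(group1, group2) %>%
    summarize(vals = sum(vals), .groups = "drop")

  data %>%
    filter(group1 != .env$target) %>%
    bind_rows(summed) %>%
    # sorting is a choice made here; drop it to keep the untouched rows
    # in their original order followed by the summed rows
    arrange(group1, group2)
}

result <- sum_rows_where(df, 3)
result
```

Using .env$target makes it explicit that target is the function argument rather than a column of data.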
