
Using a for loop to sum data by group, then break and print the total while ignoring NA

How can I use a for loop to sum data by group, then break and print the accumulated sum for A and B respectively? For example:

Type value
A  2
A  NA
A  13  15
B  565
B  245
B  578  1388
library(dplyr)
df %>% 
  group_by(Type) %>% 
  mutate(cs = cumsum(value, na.rm = True)) 

But this only shows the whole table, and the sum for A, which should be 15, ends up as NA.

Type value cs
A     2    2
A     NA   NA
A     13   NA
B     565  565
B     245  810
B     578  1388
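
The NA in rows 2 and 3 arises because cumsum() has no na.rm argument and carries an NA forward to every later element. A minimal base R sketch, assuming df holds the six rows from the question as integers (the column name cs_fixed is made up for illustration), treats missing values as 0 before accumulating within each group:

```r
# Sample data as in the question (assumed to be integer values)
df <- data.frame(
  Type  = c("A", "A", "A", "B", "B", "B"),
  value = c(2L, NA, 13L, 565L, 245L, 578L)
)

# cumsum() propagates NA: every element from the first NA onwards is NA
cumsum(df$value[df$Type == "A"])

# Replace NA by 0 first, then accumulate separately within each Type
v <- ifelse(is.na(df$value), 0L, df$value)
df$cs_fixed <- ave(v, df$Type, FUN = cumsum)
df
```

With this substitution the last row of each group carries the group total (15 for A, 1388 for B), which is the value the question asks for.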

Using dplyr, you can try:

library(dplyr)
df %>% 
  group_by(Type) %>%
  mutate(cs = sum(value, na.rm = TRUE)) %>%        # group total, ignoring NA
  mutate(id = row_number()) %>%                    # create a dummy id column
  mutate(cs = replace(cs, id != max(id), NA)) %>%  # set cs to NA except in the last row of each Type
  select(-id)                                      # remove the id column
#Output
# A tibble: 6 x 3
# Groups:   Type [2]
  Type  value    cs
  <chr> <int> <int>
1 A         2    NA
2 A        NA    NA
3 A        13    15
4 B       565    NA
5 B       245    NA
6 B       578  1388

If I understand correctly, the OP expects that all rows of the new column cs are blank except for the last row of each group where the sum of the values belonging to the group should be printed.

A blank row is only possible if the new column cs is of type character. If cs is expected to be of numeric type, there is no other choice than to print 0, NA, or some other numeric value, but not "" (the empty string).
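
The point about column types can be checked directly in base R: an atomic vector holds a single type, so mixing a number with an empty string coerces everything to character. A small sketch (the names num_col and chr_col are only for illustration):

```r
num_col <- c(NA, NA, 15)   # numeric: "blank" rows have to be NA (or some number)
chr_col <- c("", "", 15)   # mixing "" with a number coerces the vector to character

class(num_col)   # "numeric"
class(chr_col)   # "character"
chr_col          # "" "" "15"
```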

So, below are suggestions for creating a character column, in data.table and dplyr syntax respectively, using either

  • ifelse() (fifelse() in data.table),
  • replace() and rep(), or
  • c() and rep().

Note that no for loop is required at all.
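
As a quick illustration that no loop is needed (a base R sketch that is not part of the suggestions below, assuming df holds the six rows from the question), tapply() computes the per-group totals in one vectorised call:

```r
# Sample data as in the question (assumed to be integer values)
df <- data.frame(
  Type  = c("A", "A", "A", "B", "B", "B"),
  value = c(2L, NA, 13L, 565L, 245L, 578L)
)

# One total per Type, ignoring NA; no explicit for loop involved
totals <- tapply(df$value, df$Type, sum, na.rm = TRUE)
totals
```

This gives one number per group rather than a new column; the variants that follow place that total in the last row of each group.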

data.table

library(data.table)
setDT(df)[, cs := fifelse(1:.N == .N, as.character(sum(value, na.rm = TRUE)), ""), by = Type][]

or

setDT(df)[, cs := replace(rep("", .N), .N, sum(value, na.rm = TRUE)), by = Type][]

or

setDT(df)[, cs := c(rep("", .N - 1L), sum(value, na.rm = TRUE)), by = Type][]
   Type value   cs
1:    A     2
2:    A    NA
3:    A    13   15
4:    B   565
5:    B   245
6:    B   578 1388

dplyr

library(dplyr)
df %>% 
  group_by(Type) %>% 
  mutate(cs = ifelse(row_number() == n(), sum(value, na.rm = TRUE), ""))

or

df %>% 
  group_by(Type) %>% 
  mutate(cs = replace(rep("", n()), n(), sum(value, na.rm = TRUE)))

or

df %>% 
  group_by(Type) %>% 
  mutate(cs = c(rep("", n() - 1L), sum(value, na.rm = TRUE)))
# A tibble: 6 x 3
# Groups:   Type [2]
  Type  value cs
  <chr> <int> <chr>
1 A         2 ""
2 A        NA ""
3 A        13 "15"
4 B       565 ""
5 B       245 ""
6 B       578 "1388"
