How to aggregate rows that contain NA values in R

Question

I would like to go from this:

City   State   x1   x2   x3
NA     CA      10   10   10
SD     CA      10   10   10
NA     CA      10   10   10
SF     CA      10   10   10
FW     TX      5    5    5
NA     TX      5    5    5
NA     TX      5    5    5

To this:

State   sum
CA      120
TX      45

col1 <- c(NA,'SD',NA,'SF','FW', NA, NA)
col2 <- c('CA', 'CA', 'CA', 'CA', 'TX', 'TX', 'TX')
col3 <- c(10,10,10,10,5,5,5)
col4 <- c(10,10,10,10,5,5,5)
col5 <- c(10,10,10,10,5,5,5)

df <- data.frame(City=col1, State=col2, x1=col3, x2=col4,x3=col5)
col6 <- c('CA', 'TX')
col7 <- c(120, 45)

solution <- data.frame(State=col6, sum=col7)

Edit: fixed an error in the data frame and changed 'NA' to NA. Thank you to Ronak for replying so quickly.

Answer 1

@Ronak Shah's solution is much better, but here is another, longer but still effective, solution that introduces some functions worth knowing for the future:

library(dplyr)

df %>%
  group_by(State) %>%
  summarise(across(x1:x3, ~ sum(.x, na.rm = TRUE))) %>%   # We use across() for column-wise operations
  rowwise() %>%
  mutate(sum = sum(c_across(x1:x3), na.rm = TRUE)) %>%    # We use rowwise() function + c_across() for row wise operations
  select(-c(x1:x3))

# A tibble: 2 x 2
# Rowwise: 
  State   sum
  <chr> <int>
1 CA      120
2 TX       45

This is also very useful, and closer to the solution mentioned above:

df %>%
  group_by(State) %>%
  summarise(sum = sum(c_across(x1:x3), na.rm = TRUE))

# A tibble: 2 x 2
  State   sum
  <chr> <int>
1 CA      120
2 TX       45
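
For comparison, the same totals can be reached without dplyr. The following is a minimal base R sketch, not part of the original answer; it rebuilds the question's data frame so it runs on its own, and the names row_total and result are only illustrative:

```r
# Rebuild the question's data frame so the example is self-contained.
df <- data.frame(
  City  = c(NA, "SD", NA, "SF", "FW", NA, NA),
  State = c("CA", "CA", "CA", "CA", "TX", "TX", "TX"),
  x1 = c(10, 10, 10, 10, 5, 5, 5),
  x2 = c(10, 10, 10, 10, 5, 5, 5),
  x3 = c(10, 10, 10, 10, 5, 5, 5)
)

# Step 1: add up x1, x2 and x3 within each row.
row_total <- rowSums(df[, c("x1", "x2", "x3")], na.rm = TRUE)

# Step 2: add up those row totals within each State.
# City is not used for grouping, so its NA values do not remove any rows.
result <- aggregate(data.frame(sum = row_total),
                    by = list(State = df$State),
                    FUN = sum)
result
```

Because only State is used as the grouping variable, the rows whose City is NA still contribute to the totals.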

Answer 2

You can subset the columns to sum from cur_data() in dplyr.

library(dplyr)

df %>%
  group_by(State) %>%
  summarise(sum = sum(select(cur_data(), x1:x3), na.rm = TRUE))

#  State   sum
#  <chr> <int>
#1 CA      120
#2 TX       45

Data

df <- structure(list(City = c(NA, "SD", NA, "SF", "FW", NA, NA), State = c("CA", 
"CA", "CA", "CA", "TX", "TX", "TX"), x1 = c(10L, 10L, 10L, 10L, 
5L, 5L, 5L), x2 = c(10L, 10L, 10L, 10L, 5L, 5L, 5L), x3 = c(10L, 
10L, 10L, 10L, 5L, 5L, 5L)), class = "data.frame", row.names = c(NA, -7L))
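
Note that cur_data() was deprecated in dplyr 1.1.0 in favour of pick(). The following is a sketch of the same approach for newer versions, assuming dplyr >= 1.1.0 is installed; it is an adaptation rather than the answerer's own code, and the name result is only illustrative:

```r
library(dplyr)

# Same data as above, rebuilt so the example is self-contained.
df <- data.frame(
  City  = c(NA, "SD", NA, "SF", "FW", NA, NA),
  State = c("CA", "CA", "CA", "CA", "TX", "TX", "TX"),
  x1 = c(10L, 10L, 10L, 10L, 5L, 5L, 5L),
  x2 = c(10L, 10L, 10L, 10L, 5L, 5L, 5L),
  x3 = c(10L, 10L, 10L, 10L, 5L, 5L, 5L)
)

# pick() returns the selected columns of the current group as a tibble;
# sum() then adds up every value in that tibble.
result <- df %>%
  group_by(State) %>%
  summarise(sum = sum(pick(x1:x3), na.rm = TRUE))

result
```

As in the original, the NA values in City play no role because only State is used for grouping.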

Answer 3

We can use data.table methods for efficiency. Convert the data.frame to a data.table (setDT(df)), group by State, specify the columns as a pattern of column names in .SDcols, get the rowSums of the Subset of Data.table (.SD), and sum the result.

library(data.table)
setDT(df)[ , sum(rowSums(.SD), na.rm = TRUE), State, 
     .SDcols = patterns('^x\\d+$')]
#   State  V1
#1:    CA 120
#2:    TX  45

Data

df <- structure(list(City = c(NA, "SD", NA, "SF", "FW", NA, NA), State = c("CA", 
"CA", "CA", "CA", "TX", "TX", "TX"), x1 = c(10L, 10L, 10L, 10L, 
5L, 5L, 5L), x2 = c(10L, 10L, 10L, 10L, 5L, 5L, 5L), x3 = c(10L, 
10L, 10L, 10L, 5L, 5L, 5L)), class = "data.frame",
   row.names = c(NA, -7L))
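
The call above returns the total in a column named V1. The following is a small variation, sketched on the assumption that a column literally named sum is wanted, as in the question's target output. It wraps j in .() to name the column, and it also passes na.rm = TRUE to rowSums(), so that a missing value in one x column would not discard the rest of that row. The names dt and result are only illustrative:

```r
library(data.table)

# Same data as above, built directly as a data.table.
dt <- data.table(
  City  = c(NA, "SD", NA, "SF", "FW", NA, NA),
  State = c("CA", "CA", "CA", "CA", "TX", "TX", "TX"),
  x1 = c(10L, 10L, 10L, 10L, 5L, 5L, 5L),
  x2 = c(10L, 10L, 10L, 10L, 5L, 5L, 5L),
  x3 = c(10L, 10L, 10L, 10L, 5L, 5L, 5L)
)

# .SDcols restricts .SD to the columns named x followed by digits;
# .(sum = ...) names the aggregated column instead of the default V1.
result <- dt[, .(sum = sum(rowSums(.SD, na.rm = TRUE))),
             by = State,
             .SDcols = patterns("^x\\d+$")]

result
```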

3 answers

Answer 1: 3, 2021-04-17 10:45:54
Answer 2: 1, accepted, 2021-04-17 10:15:35
Answer 3: 1, 2021-04-17 17:26:08
