Aggregating columns based on column names in R

I have this data frame in R:

Party Pro2005 Anti2005 Pro2006 Anti2006 Pro2007 Anti2007
R           1       18       0        7       2       13
R           1       19       0        7       1       14
D          13        7       3        4      10        5
D          12        8       3        4       9        6

I want to aggregate it so that all the Pro columns and all the Anti columns are summed by party.

For example (placeholder values):

Party ProSum AntiSum
R        234     245
D        234     245

How would I do that in R?

I would suggest a tidyverse approach: reshape the data to long format, then compute the sum of the values.

# Sample data (dput output of the question's four-row data frame)
df <- structure(list(Party = c("R", "R", "D", "D"), Pro2005 = c(1L, 
1L, 13L, 12L), Anti2005 = c(18L, 19L, 7L, 8L), Pro2006 = c(0L, 
0L, 3L, 3L), Anti2006 = c(7L, 7L, 4L, 4L), Pro2007 = c(2L, 1L, 
10L, 9L), Anti2007 = c(13L, 14L, 5L, 6L)), class = "data.frame", row.names = c(NA, 
-4L))

The code:

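library(dplyr)
library(tidyr)

df %>% pivot_longer(cols = -1) %>%
  # Strip the year digits so the names collapse to "Pro" and "Anti"
  mutate(name = gsub('\\d+', '', name)) %>%
  # Sum within each Party/name pair
  group_by(Party, name) %>% summarise(value = sum(value, na.rm = TRUE)) %>%
  pivot_wider(names_from = name, values_from = value)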

The output:

# A tibble: 2 x 3
# Groups:   Party [2]
  Party  Anti   Pro
  <chr> <int> <int>
1 D        34    50
2 R        78     5

Split the data by party using by, loop sum over the Pro/Anti columns using sapply, and finally rbind the results.

res <- data.frame(Party=sort(unique(d$Party)), do.call(rbind, by(d, d$Party, function(x) 
  sapply(c("Pro", "Anti"), function(y) sum(x[grep(y, names(x))])))))
#   Party Pro Anti
# D     D  50   34
# R     R   5   78
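
The one-liner above packs several steps into a single expression. As a rough sketch of the same idea unrolled step by step (base R only; the data is rebuilt inline so the block runs on its own, `sum_prefixes` is a hypothetical helper name, and the `^` anchor in the pattern is an added assumption so that only names starting with the prefix match):

```r
# Sample data from the question, rebuilt inline so the block is self-contained.
d <- data.frame(
  Party    = c("R", "R", "D", "D"),
  Pro2005  = c(1L, 1L, 13L, 12L),
  Anti2005 = c(18L, 19L, 7L, 8L),
  Pro2006  = c(0L, 0L, 3L, 3L),
  Anti2006 = c(7L, 7L, 4L, 4L),
  Pro2007  = c(2L, 1L, 10L, 9L),
  Anti2007 = c(13L, 14L, 5L, 6L)
)

# Step 1: split the rows into one data frame per party.
by_party <- split(d, d$Party)

# Step 2: for one party, sum every column whose name starts with a prefix.
# sum_prefixes is a hypothetical helper used only for this illustration.
sum_prefixes <- function(x, prefixes = c("Pro", "Anti")) {
  sapply(prefixes, function(p) sum(x[grep(paste0("^", p), names(x))]))
}

# Step 3: apply the helper to each party and stack the results row-wise.
sums <- do.call(rbind, lapply(by_party, sum_prefixes))

# Step 4: turn the party labels (the matrix row names) into a regular column.
res_steps <- data.frame(Party = rownames(sums), sums, row.names = NULL)
print(res_steps)
```

This should give the same sums as the one-liner (Pro 50 and Anti 34 for D, Pro 5 and Anti 78 for R), with plain integer row names instead of the party labels.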

An outer solution is also suitable.

t(outer(c("Pro", "Anti"), c("R", "D"), 
      Vectorize(function(x, y) sum(d[d$Party %in% y, grep(x, names(d))]))))
#      [,1] [,2]
# [1,]    5   78
# [2,]   50   34
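
The matrix returned by outer carries no row or column labels, so it is easy to misread which number belongs to which party. A possible variant that attaches labels, sketched under the assumption that the data matches the question (the names `parties`, `prefixes`, and `m` are illustrative only):

```r
# Sample data from the question, rebuilt inline so the block is self-contained.
d <- data.frame(
  Party    = c("R", "R", "D", "D"),
  Pro2005  = c(1L, 1L, 13L, 12L),
  Anti2005 = c(18L, 19L, 7L, 8L),
  Pro2006  = c(0L, 0L, 3L, 3L),
  Anti2006 = c(7L, 7L, 4L, 4L),
  Pro2007  = c(2L, 1L, 10L, 9L),
  Anti2007 = c(13L, 14L, 5L, 6L)
)

parties  <- c("R", "D")       # row order of the result
prefixes <- c("Pro", "Anti")  # column order of the result

# One sum per (prefix, party) pair; t() puts parties in rows.
m <- t(outer(prefixes, parties,
             Vectorize(function(x, y) sum(d[d$Party %in% y, grep(x, names(d))]))))

# Label rows and columns so the matrix is self-describing.
dimnames(m) <- list(parties, prefixes)
print(m)
```

Individual cells can then be looked up by name, for example `m["D", "Pro"]`.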


Data:

d <- read.table(header=TRUE, text="Party Pro2005 Anti2005 Pro2006 Anti2006 Pro2007 Anti2007
R       1       18       0        7       2       13
R       1       19       0        7       1       14
D      13        7       3        4      10        5
D      12        8       3        4       9        6")

You can use:

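library(dplyr)
library(tidyr)

df %>% 
  # ".value" keeps the letter prefix (Pro/Anti) as the column name; NA drops the year
  pivot_longer(cols = -Party,
               names_to = c(".value", NA),
               names_pattern = "([a-zA-Z]*)([0-9]*)") %>% 
  group_by(Party) %>% 
  summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE)))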

# A tibble: 2 x 3
  Party   Pro  Anti
  <chr> <int> <int>
1 D        50    34
2 R         5    78

