
How to additively merge columns in a dataframe with similar column names?

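I have a large data frame in which several columns need to be additively merged based on the first part of the column name (the part before .S*)...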

This code generates an example data frame:

DF1 = structure(list(taxonomy = c("cat", "dog","horse","mouse","frog", "lion"),
                 A = c(0L, 5L, 3L, 0L, 0L, 0L), D = c(2L, 1L, 0L, 0L, 2L, 0L), C = c(0L, 0L, 0L, 4L, 4L, 2L)), 
            .Names = c("taxonomy", "A.S595", "B.S596", "B.S487"), 
            row.names = c(NA, -6L), class = "data.frame")

The data frame looks like this:

  taxonomy A.S595 B.S596 B.S487
1      cat 0      2      0
2      dog 5      1      0
3    horse 3      0      0
4    mouse 0      0      4
5     frog 0      2      4
6     lion 0      0      2

I would like the output to look like this:

  taxonomy A      B 
1      cat 0      2      
2      dog 5      1      
3    horse 3      0      
4    mouse 0      4      
5     frog 0      6      
6     lion 0      2  

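One option is to split the data frame by the first character of the integer columns' names, loop over the resulting list, take the rowSums of each piece, and cbind the result with the first column: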

cbind(DF1[1], sapply(split.default(DF1[-1], substr(names(DF1)[-1], 1, 1)), rowSums))
#  taxonomy A B
#1      cat 0 2
#2      dog 5 1
#3    horse 3 0
#4    mouse 0 4
#5     frog 0 6
#6     lion 0 2
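
The call above groups columns by their first character only, which works here because the prefixes are the single letters A and B. If the real column names carry longer prefixes, the grouping key could instead be derived by stripping everything from ".S" onward. The following is a minimal sketch under that assumption; the names `prefix` and `res` are illustrative, and the example data is rebuilt from the question so the block runs on its own.

```r
# Example data, equivalent to the structure() call in the question
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L)
)

# Grouping key: the part of each column name before the first ".S"
# (assumes every column except the first follows the <prefix>.S<id> pattern)
prefix <- sub("\\.S.*$", "", names(DF1)[-1])

# Split the integer columns by prefix, sum each group row-wise,
# then reattach the taxonomy column
res <- cbind(DF1[1], sapply(split.default(DF1[-1], prefix), rowSums))
res
```

Because the key is computed from the whole prefix, hypothetical columns such as AB.S100 and A.S200 would land in separate groups rather than both being filed under "A".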

Or with the tidyverse:

library(tidyverse)
rownames_to_column(DF1) %>% 
   gather(key, val, -taxonomy, -rowname) %>%
   separate(key, into = c('key1', 'key2')) %>% 
   group_by(rowname, key1) %>% 
   summarise(val = sum(val)) %>% 
   spread(key1, val)  %>% 
   ungroup %>% 
   select(-rowname) %>% 
   bind_cols(DF1[1], .)
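
In current tidyr, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same reshape-sum-reshape idea might be written as below. This is a sketch, not the answer's original code: the column names `row`, `group`, `sample`, and `val` are illustrative choices, and the integer `row` index is kept so that the original row order survives the grouping step even when there are more than nine rows.

```r
library(dplyr)
library(tidyr)

# Example data, equivalent to the structure() call in the question
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L)
)

res <- DF1 %>%
  # integer row index, so grouping sorts rows numerically
  mutate(row = row_number()) %>%
  # long format; "A.S595" is split at the dot into group "A" and sample "S595"
  pivot_longer(-c(taxonomy, row),
               names_to = c("group", "sample"),
               names_sep = "\\.",
               values_to = "val") %>%
  # sum all samples that share a prefix within each original row
  group_by(row, taxonomy, group) %>%
  summarise(val = sum(val), .groups = "drop") %>%
  # back to wide format: one column per prefix
  pivot_wider(names_from = group, values_from = val) %>%
  select(-row)

res
```

The result is a tibble with the columns taxonomy, A, and B, in the same row order as DF1.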

Another tidyverse version, which sums only the columns whose names start with "B.S":

DF1 %>%
  select(matches("^B\\.S.*")) %>%
  rowSums %>%
  bind_cols(
    select(DF1, -matches("^B\\.S.*")),
    B = .
    ) %>%
  # rename_with() replaces the deprecated rename_at()/funs() pair (requires dplyr >= 1.0.0)
  rename_with(~ gsub("\\.S[0-9]+", "", .x), matches("\\.S[0-9]+"))

  taxonomy A B
1      cat 0 2
2      dog 5 1
3    horse 3 0
4    mouse 0 4
5     frog 0 6
6     lion 0 2
