
How to additively merge columns in a dataframe with similar column names?

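I have a large data frame in which several columns need to be merged additively, based on the first part of the column name (the part before .S*)...

The example data frame can be generated with this code: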

DF1 = structure(list(taxonomy = c("cat", "dog","horse","mouse","frog", "lion"),
                 A = c(0L, 5L, 3L, 0L, 0L, 0L), D = c(2L, 1L, 0L, 0L, 2L, 0L), C = c(0L, 0L, 0L, 4L, 4L, 2L)), 
            .Names = c("taxonomy", "A.S595", "B.S596", "B.S487"), 
            row.names = c(NA, -6L), class = "data.frame")

The data frame looks like this:

  taxonomy A.S595 B.S596 B.S487
1      cat 0      2      0
2      dog 5      1      0
3    horse 3      0      0
4    mouse 0      0      4
5     frog 0      2      4
6     lion 0      0      2

I would like the output to look like this:

  taxonomy A      B 
1      cat 0      2      
2      dog 5      1      
3    horse 3      0      
4    mouse 0      4      
5     frog 0      6      
6     lion 0      2  

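One option is to split the dataset by the names of the integer columns (here, by the first character of each name), loop over the resulting list, get the rowSums of each piece, and cbind the result with the first column: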

cbind(DF1[1], sapply(split.default(DF1[-1], substr(names(DF1)[-1], 1, 1)), rowSums))
#  taxonomy A B
#1      cat 0 2
#2      dog 5 1
#3    horse 3 0
#4    mouse 0 4
#5     frog 0 6
#6     lion 0 2
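
The one-liner above groups columns with substr(..., 1, 1), so it only works when every prefix is a single character. As a minimal sketch, assuming the real column names always end in ".S" followed by digits (an assumption based on the example data, not something the question states in full), the grouping label can be derived with sub() instead:

```r
# Rebuild the example data from the question
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L)
)

# Group label = column name with the trailing ".S<digits>" removed
# (assumed naming pattern; adjust the regex if the real names differ)
grp <- sub("\\.S[0-9]+$", "", names(DF1)[-1])

# Split the value columns by label, sum each group row-wise,
# then re-attach the taxonomy column
res <- cbind(DF1[1], sapply(split.default(DF1[-1], grp), rowSums))
print(res)
```

With this change, prefixes of any length are kept whole rather than being truncated to their first letter.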

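Or using tidyverse:

library(tidyverse)
rownames_to_column(DF1) %>% 
   # long format: one row per (row, original column)
   gather(key, val, -taxonomy, -rowname) %>%
   # "A.S595" -> key1 = "A", key2 = "S595"
   separate(key, into = c('key1', 'key2')) %>% 
   # convert rowname to integer so rows sort numerically (1, 2, ..., 10),
   # not as text ("1", "10", "2"), and stay aligned with DF1 below
   group_by(rowname = as.integer(rowname), key1) %>% 
   summarise(val = sum(val)) %>% 
   spread(key1, val)  %>% 
   ungroup %>% 
   select(-rowname) %>% 
   bind_cols(DF1[1], .)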
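
gather() and spread() are superseded in current tidyr. The same reshape-and-sum idea can be sketched with pivot_longer() and pivot_wider(). This is an illustration rather than part of the original answer. It assumes tidyr 1.1.0 or later, that the prefix and the sample ID are separated by a single ".", and that taxonomy values are unique (duplicate taxonomy rows would be summed together).

```r
library(tidyr)

# Rebuild the example data from the question
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L)
)

# Long format: keep the part before "." as `group`;
# the NA in names_to discards the ".S*" part of each name
long <- pivot_longer(
  DF1,
  cols = -taxonomy,
  names_to = c("group", NA),
  names_sep = "\\.",
  values_to = "val"
)

# Back to wide format; cells sharing the same (taxonomy, group)
# are collapsed with sum()
res <- pivot_wider(long, names_from = group, values_from = val, values_fn = sum)
print(res)
```

Because taxonomy is carried through as the identifier column, no row-name bookkeeping or final bind_cols() step is needed.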

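Another version using tidyverse (note that the B prefix is hard-coded here):

DF1 %>%
  # sum all B.S* columns row-wise
  select(matches("^B\\.S.*")) %>%
  rowSums %>%
  # attach the sums as column B next to the remaining columns
  bind_cols(
    select(DF1, -matches("^B\\.S.*")),
    B = .
    ) %>%
  # strip the ".S<digits>" suffix from the remaining names (funs() is deprecated, so a formula is used)
  rename_at(vars(matches("\\.S[0-9]+")), ~ gsub("\\.S[0-9]+", "", .))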

  taxonomy A B
1      cat 0 2
2      dog 5 1
3    horse 3 0
4    mouse 0 4
5     frog 0 6
6     lion 0 2
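
The version above names the B prefix explicitly, so every additional prefix would need its own step. A rough sketch of the same select-then-rowSums idea applied to every prefix found in the column names follows. The `prefixes` and `sums` variables are illustrative names, and the ".S<digits>" suffix pattern is assumed from the example data.

```r
library(dplyr)

# Rebuild the example data from the question
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L)
)

# Distinct prefixes, e.g. "A" and "B" (assumes a ".S<digits>" suffix)
prefixes <- unique(sub("\\.S[0-9]+$", "", names(DF1)[-1]))

# For each prefix, select its columns and sum them row-wise;
# sapply() returns a matrix with one column per prefix
sums <- sapply(prefixes, function(p) {
  rowSums(select(DF1, starts_with(paste0(p, ".S"), ignore.case = FALSE)))
})

# Attach the summed columns to the taxonomy column
res <- bind_cols(DF1["taxonomy"], as.data.frame(sums))
print(res)
```

Matching on the prefix plus ".S" keeps a prefix such as "A" from also picking up columns of a longer prefix that starts with the same letter.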
