How to additively merge columns in a dataframe with similar column names?
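I have a large data frame in which several columns need to be additively merged based on the first part of the column name (the part before .S*)...
This code generates an example data frame: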
DF1 = structure(list(taxonomy = c("cat", "dog","horse","mouse","frog", "lion"),
A = c(0L, 5L, 3L, 0L, 0L, 0L), D = c(2L, 1L, 0L, 0L, 2L, 0L), C = c(0L, 0L, 0L, 4L, 4L, 2L)),
.Names = c("taxonomy", "A.S595", "B.S596", "B.S487"),
row.names = c(NA, -6L), class = "data.frame")
The data frame looks like this:
taxonomy A.S595 B.S596 B.S487
1 cat 0 2 0
2 dog 5 1 0
3 horse 3 0 0
4 mouse 0 0 4
5 frog 0 2 4
6 lion 0 0 2
I would like the output to look like this:
taxonomy A B
1 cat 0 2
2 dog 5 1
3 horse 3 0
4 mouse 0 4
5 frog 0 6
6 lion 0 2
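One option is to split the dataset by the names of the integer columns, loop over the resulting list, get the rowSums of each element, and cbind the result with the first column: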
cbind(DF1[1], sapply(split.default(DF1[-1], substr(names(DF1)[-1], 1, 1)), rowSums))
# taxonomy A B
#1 cat 0 2
#2 dog 5 1
#3 horse 3 0
#4 mouse 0 4
#5 frog 0 6
#6 lion 0 2
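The `substr(..., 1, 1)` call above keys on the first character of each name only, which is enough for the sample data. If the real prefixes are longer than one character, a variant that strips everything from `.S` onward may be safer. The following is a minimal sketch, assuming every numeric column name has the form `<prefix>.S<digits>`; the variable names `prefix` and `res` are introduced here for illustration only.

```r
# Example data from the question
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L),
  stringsAsFactors = FALSE
)

# Group key: the column name with the ".S..." suffix removed
# (assumption: all numeric columns follow the "<prefix>.S<digits>" pattern)
prefix <- sub("\\.S.*$", "", names(DF1)[-1])

# Split the numeric columns by prefix, sum each group row-wise,
# then put the taxonomy column back in front
res <- cbind(DF1[1], sapply(split.default(DF1[-1], prefix), rowSums))
res
```

With the sample data this should give the same `taxonomy`, `A`, `B` table as above.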
Or use tidyverse:
library(tidyverse)
rownames_to_column(DF1) %>%
gather(key, val, -taxonomy, -rowname) %>%
separate(key, into = c('key1', 'key2')) %>%
group_by(rowname, key1) %>%
summarise(val = sum(val)) %>%
spread(key1, val) %>%
ungroup %>%
select(-rowname) %>%
bind_cols(DF1[1], .)
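In newer versions of tidyr, `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`. The same reshape-sum-reshape idea could be sketched as follows, assuming tidyr 1.0.0 or later and dplyr 1.0.0 or later (for the `.groups` argument). The helper column `row` and the result name `res` are hypothetical; an integer row index is used so that the original row order is kept even when `taxonomy` contains duplicates.

```r
library(dplyr)
library(tidyr)

# Example data from the question
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L),
  stringsAsFactors = FALSE
)

res <- DF1 %>%
  mutate(row = row_number()) %>%                      # integer index to preserve row order
  pivot_longer(-c(taxonomy, row),
               names_to = "key", values_to = "val") %>%
  mutate(key = sub("\\.S.*$", "", key)) %>%           # keep only the prefix before ".S"
  group_by(row, taxonomy, key) %>%
  summarise(val = sum(val), .groups = "drop") %>%     # add up columns sharing a prefix
  pivot_wider(names_from = key, values_from = val) %>%
  arrange(row) %>%
  select(-row)

res
```

The result is a tibble with the columns `taxonomy`, `A` and `B`; wrap it in `as.data.frame()` if a plain data frame is needed.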
Another version using tidyverse:
DF1 %>%
select(matches("^B\\.S.*")) %>%
rowSums %>%
bind_cols(
select(DF1, -matches("^B\\.S.*")),
B = .
) %>%
rename_at(vars(matches("\\.S[0-9]+")), ~ gsub("\\.S[0-9]+", "", .))
taxonomy A B
1 cat 0 2
2 dog 5 1
3 horse 3 0
4 mouse 0 4
5 frog 0 6
6 lion 0 2
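The version above names the `B` prefix explicitly, so each additional prefix would need its own step. For many prefixes, one possible base R alternative is `rowsum()`, which adds up the rows of a matrix by group; transposing before and after turns that into a column-wise merge. This is only a sketch, assuming all columns other than `taxonomy` are numeric and named `<prefix>.S<digits>`; `num`, `prefix`, `summed` and `res` are illustrative names.

```r
# Example data from the question
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L),
  stringsAsFactors = FALSE
)

# Numeric part of the data as a matrix (columns = samples)
num <- as.matrix(DF1[-1])

# Group key: column name with the ".S..." suffix removed
prefix <- sub("\\.S.*$", "", colnames(num))

# rowsum() sums rows by group, so transpose, sum, and transpose back
summed <- t(rowsum(t(num), group = prefix))

# Reattach the taxonomy column
res <- data.frame(taxonomy = DF1$taxonomy, summed, row.names = NULL)
res
```

Because the grouping is derived from the column names, no prefix has to be written out by hand.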