How to additively merge columns in a dataframe with similar column names?
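I have a large data frame in which several columns need to be merged additively based on the first part of the column name (the part before .S*)...
The example data frame can be generated with this code: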
DF1 = structure(list(taxonomy = c("cat", "dog","horse","mouse","frog", "lion"),
A = c(0L, 5L, 3L, 0L, 0L, 0L), D = c(2L, 1L, 0L, 0L, 2L, 0L), C = c(0L, 0L, 0L, 4L, 4L, 2L)),
.Names = c("taxonomy", "A.S595", "B.S596", "B.S487"),
row.names = c(NA, -6L), class = "data.frame")
The data frame looks like this:
  taxonomy A.S595 B.S596 B.S487
1      cat      0      2      0
2      dog      5      1      0
3    horse      3      0      0
4    mouse      0      0      4
5     frog      0      2      4
6     lion      0      0      2
I would like the output to look like this:
  taxonomy A B
1      cat 0 2
2      dog 5 1
3    horse 3 0
4    mouse 0 4
5     frog 0 6
6     lion 0 2
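One option is to split the dataset by the names of the integer columns, loop over the resulting list, take the rowSums of each element, and cbind the result with the first column. Here substr(names(DF1)[-1], 1, 1) uses the first character of each column name as the grouping key, which works because the prefixes in the example are single letters.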
cbind(DF1[1], sapply(split.default(DF1[-1], substr(names(DF1)[-1], 1, 1)), rowSums))
#  taxonomy A B
#1      cat 0 2
#2      dog 5 1
#3    horse 3 0
#4    mouse 0 4
#5     frog 0 6
#6     lion 0 2
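If the real column prefixes are longer than one character, the grouping key can be derived with a regular expression instead of substr. The following is a minimal sketch of that variation, not part of the original answer. It assumes every column to be merged ends in ".S" followed by digits, and the names prefix and res are introduced here only for illustration.

```r
# Same sample data as in the question, built with data.frame() for readability
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L)
)

# Grouping key: the column name with the trailing ".S<digits>" removed
# (assumption: all columns to be merged follow this naming pattern)
prefix <- sub("\\.S[0-9]+$", "", names(DF1)[-1])

# Split the numeric columns by prefix, sum each group row-wise,
# then re-attach the taxonomy column
res <- cbind(DF1[1], sapply(split.default(DF1[-1], prefix), rowSums))
print(res)
```

For the sample data this should give the same A and B columns as the substr version, since the prefixes are "A" and "B" either way.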
Or using tidyverse:
library(tidyverse)
rownames_to_column(DF1) %>%
  gather(key, val, -taxonomy, -rowname) %>%
  separate(key, into = c('key1', 'key2')) %>%
  group_by(rowname, key1) %>%
  summarise(val = sum(val)) %>%
  spread(key1, val) %>%
  ungroup %>%
  select(-rowname) %>%
  bind_cols(DF1[1], .)
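The tidyr documentation marks gather and spread as superseded by pivot_longer and pivot_wider, so the same reshape-and-sum idea can also be sketched with the newer functions. This is an illustrative variation rather than the original answer's code. It assumes tidyr 1.1.0 or later (so that values_fn accepts a single function), that taxonomy values are unique, and that the merged columns end in ".S" followed by digits. The name res is introduced here only for illustration.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L)
)

res <- DF1 %>%
  # Long format: one row per (taxonomy, original column) pair
  pivot_longer(-taxonomy, names_to = "key", values_to = "val") %>%
  # Reduce each column name to its prefix, e.g. "B.S596" -> "B"
  mutate(key = sub("\\.S[0-9]+$", "", key)) %>%
  # Back to wide format, summing values that now share a prefix
  pivot_wider(names_from = key, values_from = val, values_fn = sum)

print(res)
```

Because taxonomy itself serves as the row identifier, this sketch does not need the temporary rowname column.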
Another version using tidyverse:
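# Sum the B.S* columns row-wise, bind the result to the remaining columns as B,
# then strip the ".S<digits>" suffix from the column names that still carry it
DF1 %>%
  select(matches("^B\\.S.*")) %>%
  rowSums %>%
  bind_cols(
    select(DF1, -matches("^B\\.S.*")),
    B = .
  ) %>%
  # funs() is deprecated in dplyr, so a formula lambda is used instead
  rename_at(vars(matches("\\.S[0-9]+")), ~ gsub("\\.S[0-9]+", "", .))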
  taxonomy A B
1      cat 0 2
2      dog 5 1
3    horse 3 0
4    mouse 0 4
5     frog 0 6
6     lion 0 2