Using a vector as a grep pattern
I am new to R. I am trying to search the column names using grep multiple times within an apply loop. I use grep to specify which columns are summed row-wise, based on the vector individuals.
individuals <- c("ID1", "ID2", ...)  # n IDs in total
bcdata_total <- sapply(individuals, function(x) {
  apply(bcdata_clean[, grep(individuals, colnames(bcdata_clean))], 1, sum)
})
bcdata is of arbitrary size and contains arbitrary data, but its column names contain the entries of individuals as part of the string.
>head(bcdata)
  ID1-4 ID1-3 ID2-5
A     3     2     1
B     2     2     3
C     4     5     5
grep(individuals[1], colnames(bcdata_clean)) returns a vector that looks like [1] 1 2, the indices of the columns whose names contain ID1. That vector is used to select the columns of bcdata_clean to be summed. This should happen n times, where n is the length of individuals.
However, this returns the following warning:
In grep(individuals, colnames(bcdata)) :
argument 'pattern' has length > 1 and only the first element will be used
It also results in all the columns of the output being identical.
Ideally, the index into individuals would increment on each iteration, so that the function runs like this each time:
apply(bcdata_clean[, grep(individuals[i], colnames(bcdata_clean))], 1, sum)  # for i = 1, 2, ..., n
and would result in something like this:
>head(bcdata_total)
  ID1 ID2
A   5   1
B   4   3
C   9   5
But I'm not sure how to step through individuals. What is the best way to do this within the function?
You can use split.default to split the data into groups of similarly named columns and sum each group row-wise.
sapply(split.default(df, sub('-.*', '', names(df))), rowSums, na.rm = TRUE)
#  ID1 ID2
#A   5   1
#B   4   3
#C   9   5
data
df <- structure(list(`ID1-4` = c(3L, 2L, 4L), `ID1-3` = c(2L, 2L, 5L
), `ID2-5` = c(1L, 3L, 5L)), class = "data.frame", row.names = c("A", "B", "C"))
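To see why this works, it can help to look at the grouping key on its own. The following is a minimal sketch using the same example data; it assumes the column names have the form ID-suffix, and the names key, groups and totals are only illustrative.

```r
# Same example data as in the answer above.
df <- structure(list(`ID1-4` = c(3L, 2L, 4L), `ID1-3` = c(2L, 2L, 5L),
                     `ID2-5` = c(1L, 3L, 5L)),
                class = "data.frame", row.names = c("A", "B", "C"))

# Strip everything from the first "-" onward to get one key per column.
key <- sub("-.*", "", names(df))

# split.default() treats the data frame as a list of columns and groups
# those columns by key, giving one smaller data frame per individual.
groups <- split.default(df, key)

# Sum each group's columns row-wise; sapply() binds the results into a matrix.
totals <- sapply(groups, rowSums, na.rm = TRUE)
print(totals)
```

Because the key is derived from the column names themselves, this approach does not need the individuals vector at all.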
Naming the argument of the anonymous function individuals in place of x fixed my issue. Inside the function, that name then refers to the single element sapply passes in, not to the whole vector.
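bcdata_total <- sapply(individuals, function(individuals) {
  # drop = FALSE keeps a data frame even when only one column matches,
  # which apply() needs in order to work row by row
  apply(bcdata_clean[, grep(individuals, colnames(bcdata_clean)), drop = FALSE], 1, sum)
})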
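A variant of the same idea, sketched under two assumptions that are not stated in the question: the column names follow the form ID-suffix, and bcdata_clean looks like the small example shown in the question. The argument name id and the helper variable cols are illustrative only.

```r
# Stand-in for bcdata_clean, built from the example in the question.
bcdata_clean <- data.frame(`ID1-4` = c(3, 2, 4), `ID1-3` = c(2, 2, 5),
                           `ID2-5` = c(1, 3, 5),
                           row.names = c("A", "B", "C"), check.names = FALSE)
individuals <- c("ID1", "ID2")

bcdata_total <- sapply(individuals, function(id) {
  # Assumption: names look like "<ID>-<suffix>", so anchoring the pattern
  # keeps "ID1" from also matching a column such as "ID10-2".
  cols <- grep(paste0("^", id, "-"), colnames(bcdata_clean))
  # drop = FALSE keeps a data frame even when only one column matches.
  rowSums(bcdata_clean[, cols, drop = FALSE])
})
print(bcdata_total)
```

Giving the argument a name different from the global vector makes it easier to see that grep receives one pattern per call, which is what avoids the "length > 1" warning.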
An option with tidyverse:
library(dplyr)
library(tidyr)
library(tibble)
df %>%
  rownames_to_column('rn') %>%
  pivot_longer(cols = -rn, names_to = c(".value", "grp"), names_sep = "-") %>%
  group_by(rn) %>%
  # a lambda replaces passing na.rm through across(), which dplyr 1.1.0 deprecated
  summarise(across(starts_with('ID'), ~ sum(.x, na.rm = TRUE)), .groups = 'drop') %>%
  column_to_rownames('rn')
#  ID1 ID2
#A   5   1
#B   4   3
#C   9   5
df <- structure(list(`ID1-4` = c(3L, 2L, 4L), `ID1-3` = c(2L, 2L, 5L
), `ID2-5` = c(1L, 3L, 5L)), class = "data.frame", row.names = c("A", "B", "C"))
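The reshaping step is the least obvious part of this pipeline, so here is a small sketch that stops after pivot_longer to show the intermediate table. It assumes the tidyr and tibble packages are installed; the names wide and long are illustrative only.

```r
library(tidyr)
library(tibble)

df <- structure(list(`ID1-4` = c(3L, 2L, 4L), `ID1-3` = c(2L, 2L, 5L),
                     `ID2-5` = c(1L, 3L, 5L)),
                class = "data.frame", row.names = c("A", "B", "C"))

# Move the row names into a regular column so they survive the reshape.
wide <- rownames_to_column(df, "rn")

# ".value" tells pivot_longer() that the part of each name before "-"
# becomes an output column (ID1, ID2); the part after "-" goes into "grp".
long <- pivot_longer(wide, cols = -rn,
                     names_to = c(".value", "grp"), names_sep = "-")
print(long)
```

Each suffix occurs under only one ID here, so the long table has NA wherever an ID has no column with that suffix. That is why the summarise step needs na.rm = TRUE.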