Using a vector as a grep pattern

Question

I am new to R. I am trying to run grep over the column names several times inside an sapply/apply loop. Each grep call should select the columns to be summed for one element of the vector individuals:

individuals <-c("ID1","ID2".....n)
bcdata_total <- sapply(individuals, function(x) {
  apply(bcdata_clean[,grep(individuals, colnames(bcdata_clean))], 1, sum)
})

bcdata can be of any size and contain arbitrary data, but it has columns whose names contain one of the individuals as part of the string:

>head(bcdata)
  ID1-4 ID1-3 ID2-5
A   3     2    1
B   2     2    3
C   4     5    5

grep(individuals[1], colnames(bcdata_clean)) returns a vector like [1] 1 2, the indices of the columns whose names contain ID1. That vector is used to select the columns of bcdata_clean to be summed. This should happen n times, once for each element of individuals.

However, this produces the warning

In grep(individuals, colnames(bcdata)) :
  argument 'pattern' has length > 1 and only the first element will be used

and results in all the columns of bcdata_total being identical.

Ideally, the index into individuals would increment on each iteration, like this:

 apply(bcdata_clean[,grep(individuals[1,2....n], colnames(bcdata_clean))], 1, sum)

and would result in something like this

>head(bcdata_total)
  ID1 ID2
A  5   1
B  4   3 
C  9   5

But I'm not sure how to step through individuals one element at a time. What is the best way to do this within the function?

Answer 1

You can use split.default to split the data frame into groups of similarly named columns and then sum each group row-wise.

sapply(split.default(df, sub('-.*', '', names(df))), rowSums, na.rm = TRUE)

#  ID1 ID2
#A   5   1
#B   4   3
#C   9   5

data

df <- structure(list(`ID1-4` = c(3L, 2L, 4L), `ID1-3` = c(2L, 2L, 5L
), `ID2-5` = c(1L, 3L, 5L)), class = "data.frame", row.names = c("A", "B", "C"))
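
To make the one-liner above easier to follow, here is a minimal step-by-step sketch of the same idea. The intermediate names key, groups and totals are only illustrative and do not appear in the original answer; the data is the df defined above.

```r
# Example data, identical to the df defined in this answer
df <- structure(list(`ID1-4` = c(3L, 2L, 4L), `ID1-3` = c(2L, 2L, 5L),
  `ID2-5` = c(1L, 3L, 5L)), class = "data.frame", row.names = c("A", "B", "C"))

# Step 1: build a grouping key by dropping everything from the first "-" onward
key <- sub("-.*", "", names(df))
key
# "ID1" "ID1" "ID2"

# Step 2: split the columns (not the rows) into one data frame per key
groups <- split.default(df, key)
names(groups)
# "ID1" "ID2"

# Step 3: sum each group row-wise; sapply binds the results into a matrix
totals <- sapply(groups, rowSums, na.rm = TRUE)
totals
```

Because the key is derived from the column names themselves, this version does not need a separate individuals vector.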

Answer 2

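The anonymous function never used its argument x, so grep received the whole individuals vector on every iteration. Using the function's own argument as the grep pattern fixed my issue. Adding drop = FALSE keeps the selection a data frame when only one column matches, which apply needs:

bcdata_total <- sapply(individuals, function(id) {
  apply(bcdata_clean[, grep(id, colnames(bcdata_clean)), drop = FALSE], 1, sum)
})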
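
A possible variant of this loop, as a sketch only: it assumes every column name has the form <ID>-<suffix>, as in the example data, and anchors the pattern so that an ID such as ID1 would not also match a hypothetical ID10-... column. It also uses rowSums in place of apply(..., 1, sum). The argument name id and the anchored pattern are illustrative choices, not part of the original code.

```r
# Example data from the question (column names assumed to follow "<ID>-<suffix>")
bcdata_clean <- data.frame(
  `ID1-4` = c(3L, 2L, 4L),
  `ID1-3` = c(2L, 2L, 5L),
  `ID2-5` = c(1L, 3L, 5L),
  row.names = c("A", "B", "C"),
  check.names = FALSE  # keep the "-" in the column names
)
individuals <- c("ID1", "ID2")

bcdata_total <- sapply(individuals, function(id) {
  # "^" and "-" anchor the match to the whole ID prefix (assumed naming scheme)
  cols <- grep(paste0("^", id, "-"), colnames(bcdata_clean))
  # drop = FALSE keeps a data frame even when only one column matches
  rowSums(bcdata_clean[, cols, drop = FALSE])
})
bcdata_total
```

sapply names the result columns after the elements of individuals, so the output has one column per ID and one row per row of bcdata_clean.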

Answer 3

An option with tidyverse

library(dplyr)
library(tidyr)
library(tibble)
df %>%
    rownames_to_column('rn') %>%
    pivot_longer(cols = -rn, names_to = c(".value", "grp"), names_sep="-") %>%
    group_by(rn) %>% 
    summarise(across(starts_with('ID'), ~ sum(.x, na.rm = TRUE)), .groups = 'drop') %>%
    column_to_rownames('rn')
#  ID1 ID2
#A   5   1
#B   4   3
#C   9   5

data

df <- structure(list(`ID1-4` = c(3L, 2L, 4L), `ID1-3` = c(2L, 2L, 5L
), `ID2-5` = c(1L, 3L, 5L)), class = "data.frame", row.names = c("A", "B", "C"))
