I have data on the levels of biological compounds in test patients, who are divided into groups depending on which drugs they were administered. That is, we have Coronin, Dystrophin, Tubulin (randomly Googled protein names), and so on. So we have a tibble like this (all values in the tibble are floats):
| compound | A1 | A2 | A3 | B1 ... C3|
|-----------|----|----|----|---- ... --|
| Coronin |
| Dystrophin|
| Gloverin |
| keratin |
| Tubulin |
For each compound, I wish to compute the mean of each group as a new column, like so:
| compound | A1 | A2 | A3 | B1 ...C3| mean_A | mean_B | mean_C |
|-----------|-----|-----|-----|---- ... --|---------|---------|---------|
| Coronin | 1 | 2 | 3 | ... | 2 | ... |
| Dystrophin| 4 | 5 | 6 | ... | 5 | ... |
| Gloverin | ...
| keratin |
| Tubulin |
The code to do this is:
library(dplyr)

my_tibble <- my_tibble %>%
  mutate(mean_A = rowMeans(select(., c("A1", "A2", "A3")))) %>%
  mutate(mean_B = rowMeans(select(., c("B1", "B2", "B3")))) %>%
  mutate(mean_C = rowMeans(select(., c("C1", "C2", "C3"))))
The question is: I'd like to be able to do this for a dynamically input number of groups, i.e. C, D, E, etc., where the column-to-group mapping is a separate, user-input tibble in itself, say:
| group_name | name1 | name2 | name3 |
|------------|-------|-------|-------|
| A | A1 | A2 | A3 |
| B | B1 | B2 | B3 |
...
and so on
How might I iteratively add mutate verbs according to a user-specified number of groups (and the associated sample-to-group names)?
Note: the group names "A", "B", etc. are arbitrary (the groups are, for instance, likely to be named after the drug that group was given), so I wouldn't want an iterative operation that relies on them being literally named "A", "B", etc.
One option is to split the columns by the group prefix of their names, loop through the resulting list with sapply, compute the rowMeans of each piece, and assign the results to 3 new columns:
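# Group label for each sample column: its name minus the last character
nm1 <- substr(names(df1)[-1], 1, nchar(names(df1)[-1])-1)
# Split the sample columns by group, take row means per group, add as new columns
df1[paste0("mean_", toupper(unique(nm1)))] <-
  sapply(split.default(df1[-1], nm1), rowMeans)
df1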
# compound g11 g12 g13 g21 g22 g23 g31 g32 g33 mean_G1 mean_G2 mean_G3
#1 A 7 3 9 8 8 1 3 7 2 6.333333 5.666667 4.000000
#2 B 3 8 8 1 2 5 1 1 4 6.333333 2.666667 2.000000
#3 C 8 6 7 5 1 4 3 6 3 7.000000 3.333333 4.000000
#4 D 7 9 8 5 5 6 8 7 6 8.000000 5.333333 7.000000
#5 E 2 4 1 5 2 6 6 1 3 2.333333 4.333333 3.333333
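To make the intermediate steps easier to follow, here is a minimal sketch on a tiny made-up data frame. The names toy, grp, pieces and group_means are illustrative only and do not come from the answer. Like the code above, it assumes that the group label is the sample name minus its last character.

```r
# Tiny made-up data frame, only to show the intermediate objects.
toy <- data.frame(compound = c("x", "y"),
                  g11 = c(1, 2), g12 = c(3, 4),
                  g21 = c(5, 6), g22 = c(7, 8))

# Drop the last character of each sample name to get its group label.
grp <- sub(".$", "", names(toy)[-1])
grp
# "g1" "g1" "g2" "g2"

# split.default() splits a data frame by columns rather than by rows,
# giving one small data frame per group label.
pieces <- split.default(toy[-1], grp)
names(pieces)
# "g1" "g2"

# One vector of row means per group, bound together as a matrix
# whose column names are the group labels.
group_means <- sapply(pieces, rowMeans)
group_means
```

Since sapply keeps the list names as column names, taking the new column names from colnames(group_means) may be a safer choice than unique(nm1): split.default orders the pieces by sorted group label, which need not match the order in which the labels first appear in the data.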
NOTE: This can be extended to any number of groups. The only thing to change is the 1:3 used to create the column names in the example data below:
set.seed(24)
df1 <- cbind(compound = LETTERS[1:5], as.data.frame(matrix(sample(1:9, 5 * 9,
replace = TRUE), nrow = 5, ncol = 9, dimnames = list(NULL,
paste0(rep(paste0("g", 1:3), each = 3), 1:3)))))
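When the group names are arbitrary and the sample-to-group assignment comes from a separate table, as described in the question, the prefix trick does not apply. A minimal base R sketch of one possible alternative follows. It assumes the mapping has one row per group with the sample names in the remaining columns; group_map, my_data, the group names and all values are hypothetical and only for illustration.

```r
# Hypothetical mapping table in the shape described in the question:
# one row per group, the remaining columns hold that group's sample names.
group_map <- data.frame(
  group_name = c("ctrl", "drugX"),
  name1 = c("A1", "B1"),
  name2 = c("A2", "B2"),
  name3 = c("A3", "B3"),
  stringsAsFactors = FALSE
)

# Toy measurements (made-up values).
my_data <- data.frame(
  compound = c("Coronin", "Dystrophin"),
  A1 = c(1, 4), A2 = c(2, 5), A3 = c(3, 6),
  B1 = c(10, 40), B2 = c(20, 50), B3 = c(30, 60)
)

# For each row of the mapping, select that group's sample columns
# and store their row means in a new column named mean_<group_name>.
for (i in seq_len(nrow(group_map))) {
  sample_cols <- unlist(group_map[i, -1], use.names = FALSE)
  my_data[[paste0("mean_", group_map$group_name[i])]] <-
    rowMeans(my_data[, sample_cols, drop = FALSE])
}
my_data
```

The loop reads the group names and sample names only from the mapping table, so it does not depend on how the sample columns happen to be named or on how many groups there are.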