R: Selecting rows based on values in multiple columns
I have a data frame with many columns that share a common prefix "_B", e.g. '_B1', '_B2', ..., '_Bn', so I can grab the column names with:
allB <- c(grep( "_B" , names( my.df ),value = TRUE ) )
I wish to select the rows for which each of these _B* columns passes a single condition, such as value >= some_cutoff.
Can someone tell me how to do that? My attempts with all() and any() failed.
set.seed(12345)
my.df <- data.frame(a = round(rnorm(10,5),1), m_b1= round(rnorm(10,4),1),m_b2=round(rnorm(10,4),1))
allB <- c(grep( "_b" , names( my.df ),value = TRUE ) )
> my.df
     a m_b1 m_b2
1  5.6  3.9  4.8
2  5.7  5.8  5.5
3  4.9  4.4  3.4
4  4.5  4.5  2.4
5  5.6  3.2  2.4
6  3.2  4.8  5.8
7  5.6  3.1  3.5
8  4.7  3.7  4.6
9  4.7  5.1  4.6
10 4.1  4.3  3.8
I wish to select the rows for which both m_b1 and m_b2 are >= 4.0.
We could use filter_at from dplyr and specify all_vars if all of the selected values in the row must meet the condition. If it is enough for any one of the values in the row to meet it, use any_vars instead.
library(dplyr)
my.df %>%
filter_at(allB, all_vars(. >= some_cutoff))
# data used in the example above
some_cutoff <- 3
my.df <- structure(list(`_B1` = c(1, 1, 9, 4, 10), `_B2` = c(2, 3, 12,
6, 12), V3 = c(3, 6, 13, 10, 13), V4 = c(4, 5, 16, 13, 18)), .Names = c("_B1",
"_B2", "V3", "V4"), row.names = c(NA, -5L), class = "data.frame")
allB <- grep( "_B" , names( my.df ),value = TRUE )
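In recent dplyr releases (1.0.4 and later), filter_at, all_vars and any_vars are marked as superseded in favour of if_all() and if_any() inside a plain filter(). The following is a minimal sketch of the same selection in that style, assuming the example data above; the names res_all and res_any are only illustrative.

```r
library(dplyr)

# Same example data as above, rebuilt with data.frame();
# check.names = FALSE keeps the leading underscore in the column names
my.df <- data.frame(
  `_B1` = c(1, 1, 9, 4, 10),
  `_B2` = c(2, 3, 12, 6, 12),
  V3    = c(3, 6, 13, 10, 13),
  V4    = c(4, 5, 16, 13, 18),
  check.names = FALSE
)
allB <- grep("_B", names(my.df), value = TRUE)
some_cutoff <- 3

# Keep rows where every _B* column is >= some_cutoff
res_all <- my.df %>%
  filter(if_all(all_of(allB), ~ .x >= some_cutoff))

# Keep rows where at least one _B* column is >= some_cutoff
res_any <- my.df %>%
  filter(if_any(all_of(allB), ~ .x >= some_cutoff))
```

With this data, res_all should keep the three rows where both _B1 and _B2 are at least 3, while res_any also keeps the row where only _B2 reaches the cutoff.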
In base R:
# uses my.df from the question (columns a, m_b1, m_b2)
some_cutoff <- 4
selectedCols <- my.df[grep("_b", names(my.df), fixed = TRUE)]
selectedRows <- selectedCols[apply(selectedCols, 1,
                                   function(x) all(x >= some_cutoff)), ]
selectedRows
# m_b1 m_b2
# 2 5.8 5.5
# 6 4.8 5.8
# 9 5.1 4.6
grep() is used to get the indices of the columns whose names contain the pattern of interest, and those indices are then used to subset my.df. apply() iterates over rows when the second argument is MARGIN = 1. The anonymous function returns TRUE if all() of the entries in a row meet the condition. The resulting logical vector is then used to subset the rows of selectedCols.
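As a variation on the approach above, the row-wise test can also be written without apply() by comparing the whole block of columns at once and counting the TRUE values per row. This is only a sketch: the data frame is typed in from the printed table in the question, the names bCols, keep and result are illustrative, and it indexes my.df itself so that column a is kept in the result. It assumes there are no NA values in the tested columns.

```r
# Data copied from the printed table in the question
my.df <- data.frame(
  a    = c(5.6, 5.7, 4.9, 4.5, 5.6, 3.2, 5.6, 4.7, 4.7, 4.1),
  m_b1 = c(3.9, 5.8, 4.4, 4.5, 3.2, 4.8, 3.1, 3.7, 5.1, 4.3),
  m_b2 = c(4.8, 5.5, 3.4, 2.4, 2.4, 5.8, 3.5, 4.6, 4.6, 3.8)
)
some_cutoff <- 4

# Names of the columns to test
bCols <- grep("_b", names(my.df), fixed = TRUE, value = TRUE)

# The comparison yields a logical matrix; a row passes when the number
# of TRUE entries equals the number of tested columns
keep <- rowSums(my.df[bCols] >= some_cutoff) == length(bCols)

# Subset the full data frame, so the non-_b columns are retained
result <- my.df[keep, ]
```

Here result should contain rows 2, 6 and 9, the same rows as selectedRows, but with column a included. For the "any column" variant, the condition would be rowSums(...) > 0.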