How to select columns based on criteria in a certain row in R
I have a matrix of values with both row names and column names, as shown here.
C5.Outliers
Days J1 J2 J3 J4
0.01 458 -160 -151 -52
0.02 459 -163 -154 -46
0.03 457 -165 -150 -51
Perc 0.99 0.04 0.00 0.52
I would like to create a separate matrix using only the columns for which the value in the row "Perc" is <= 50.0. In this example, I would be extracting columns J2 and J3.
This is the code I tried, which isn't working (the "Perc" row is row #1414 in my matrix):
C5.Final <- subset(C5.Outliers, 1414 < .51)
Presumably you meant <= 0.50 and not <= 50, since all "Perc" values are less than 50. You can do
df[, unlist(df["Perc",]) <= 0.5]
# J2 J3
# 0.01 -160.00 -151
# 0.02 -163.00 -154
# 0.03 -165.00 -150
# Perc 0.04 0
But this may be safer, since it takes into account any NA values that may appear in "Perc".
u <- unlist(df["Perc",]) <= 0.50
df[, u & !is.na(u)]
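As a minimal, self-contained sketch of the NA-safe version above: the data frame is rebuilt from the question's sample, and the NA in the "Perc" value for J4 is an assumption added purely to illustrate the handling. The name res is also just illustrative.

```r
# Sample data rebuilt from the question; the NA in J4's "Perc" entry is an
# assumption added here to show how missing values are treated.
df <- data.frame(
  J1 = c(458, 459, 457, 0.99),
  J2 = c(-160, -163, -165, 0.04),
  J3 = c(-151, -154, -150, 0.00),
  J4 = c(-52, -46, -51, NA),
  row.names = c("0.01", "0.02", "0.03", "Perc")
)

# TRUE where "Perc" <= 0.50, FALSE where it is larger, NA where it is missing
u <- unlist(df["Perc", ], use.names = FALSE) <= 0.50

# Treat NA as "do not keep"; drop = FALSE keeps the result a data frame
# even if only one column qualifies
res <- df[, u & !is.na(u), drop = FALSE]
print(res)
```

Here J4 is dropped because its "Perc" value is missing, and J1 is dropped because 0.99 exceeds the threshold.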
Also, you can speed it up if need be by adding use.names = FALSE in unlist(). And finally, if you have a matrix and not a data frame, then you can remove unlist() altogether.
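For the matrix case, a rough sketch might look like the following. The matrix is rebuilt from the question's sample, and the names keep and C5.Final are used here only for illustration; on the real data, the row index 1414 could presumably be used in place of "Perc".

```r
# Matrix rebuilt from the question's sample (values are filled column by column)
C5.Outliers <- matrix(
  c(458, 459, 457, 0.99,
    -160, -163, -165, 0.04,
    -151, -154, -150, 0.00,
    -52, -46, -51, 0.52),
  nrow = 4,
  dimnames = list(c("0.01", "0.02", "0.03", "Perc"),
                  c("J1", "J2", "J3", "J4"))
)

# Indexing a matrix row already yields a plain vector, so no unlist() is needed
keep <- C5.Outliers["Perc", ] <= 0.50

# Guard against NA and keep the result a matrix even if one column remains
C5.Final <- C5.Outliers[, keep & !is.na(keep), drop = FALSE]
print(C5.Final)
```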
I assume you mean 0.50, since all the values in the "Perc" row are below 50.0.
This might not be the best way, but it works:
#data:
df <- data.frame(Days=c(0.01,0.02,0.03,"Perc"),J1=c(458,459,457,0.99),
J2 =c(-165,-163,-160,0.04),J3=c(-151,-153,-131,0.00),J4=c(-52,-45,-51,0.52))
dfc <- subset(df,,select= which(c(TRUE,(df[which(df$Days == "Perc"), ] <= 0.50)[2:5])))
dfc
Days J2 J3
1 0.01 -165.00 -151
2 0.02 -163.00 -153
3 0.03 -160.00 -131
4 Perc 0.04 0
You can remove the TRUE, if you don't want the df$Days variable, change the 0.50 threshold if needed, and expand the 2:5 if you have extra columns. You can even replace the which(df$Days == "Perc") lookup with the row number 1414 if you so wish.
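If hard-coding 2:5 is inconvenient, one possible variant selects the columns by name instead. This is only a sketch using the same sample data as above; perc and keep are illustrative names, not part of the original answer.

```r
# Same sample data as in the answer above
df <- data.frame(Days = c(0.01, 0.02, 0.03, "Perc"),
                 J1 = c(458, 459, 457, 0.99),
                 J2 = c(-165, -163, -160, 0.04),
                 J3 = c(-151, -153, -131, 0.00),
                 J4 = c(-52, -45, -51, 0.52))

# "Perc" values for every column except Days, as a named numeric vector
perc <- unlist(df[df$Days == "Perc", -1])

# Names of the columns at or below the threshold; which() skips any NA
keep <- names(perc)[which(perc <= 0.50)]

# Keep Days plus the qualifying columns, however many columns df has
dfc <- df[, c("Days", keep), drop = FALSE]
print(dfc)
```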
Hope this works.