R: How to subset columns based on values of the first row?
I want to subset columns according to a certain value in the first row. Here is an example:
df <- data.frame( region = c("A", sample(1:5,3)),
region = c("B", sample(1:5,3)),
region = c("C", sample(1:5,3)),
region = c("A", sample(1:5,3)) )
> df
region region.1 region.2 region.3
1 A B C A
2 5 5 3 3
3 2 1 5 4
4 4 2 1 5
I want to select all columns that show an A in the first row. I can't do this using index numbers, as I have more than 3000 columns in my dataset. The column names are also important, which is why I'm using the first row as a second header. The result for this example should return:
region region.3
1 A A
2 5 3
3 2 4
4 4 5
And how can I avoid the automatic numbering of duplicate column names (region.1, region.2, ...)? Thanks for your ideas.
You can use a logical index built from the first row, as in
> df[, df[1, ] == "A"]
region region.3
1 A A
2 3 1
3 2 5
4 1 4
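A slightly more defensive variant of the same idea, as a sketch only: flatten the first row to a plain vector with unlist(), and pass drop = FALSE so the result stays a data frame even if just one column matches. The names keep and res are made up for illustration, and set.seed() and stringsAsFactors = FALSE are additions for reproducibility that are not part of the original example.

```r
set.seed(1)  # assumption: only here so the random sample is reproducible
df <- data.frame(region = c("A", sample(1:5, 3)),
                 region = c("B", sample(1:5, 3)),
                 region = c("C", sample(1:5, 3)),
                 region = c("A", sample(1:5, 3)),
                 stringsAsFactors = FALSE)

# Logical vector: TRUE for every column whose first row equals "A"
keep <- unlist(df[1, ]) == "A"

# drop = FALSE keeps a data frame even when a single column is selected
res <- df[, keep, drop = FALSE]
names(res)  # "region" "region.3"
```

Because keep is computed from the data rather than typed in by hand, the same two lines should work unchanged on a data frame with several thousand columns.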
For your second question, try using check.names=FALSE
> data.frame( region = c("A", sample(1:5,3)),
+ region = c("B", sample(1:5,3)),
+ region = c("C", sample(1:5,3)),
+ region = c("A", sample(1:5,3)), check.names=FALSE )
region region region region
1 A B C A
2 5 5 4 2
3 2 1 5 5
4 4 2 2 4
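One thing to keep in mind with duplicate names, shown here as a hedged sketch: name-based extraction such as df[["region"]] only returns the first matching column, so the logical index on the first row remains the way to reach all the "A" columns. The variable names first_only and res are illustrative, and set.seed() and stringsAsFactors = FALSE are assumptions added for reproducibility.

```r
set.seed(1)  # assumption: only here so the random sample is reproducible
df <- data.frame(region = c("A", sample(1:5, 3)),
                 region = c("B", sample(1:5, 3)),
                 region = c("C", sample(1:5, 3)),
                 region = c("A", sample(1:5, 3)),
                 check.names = FALSE, stringsAsFactors = FALSE)

# Extraction by name sees only the first column called "region"
first_only <- df[["region"]]

# Selecting by the values in the first row still finds both "A" columns
res <- df[, unlist(df[1, ]) == "A", drop = FALSE]
ncol(res)
```

It is worth checking names(res) after subsetting, since R may make the names of the result unique again.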