R: How to subset columns based on values of the first row?

I want to subset columns according to a certain value in the first row. Here is an example:

df <- data.frame( region = c("A", sample(1:5,3)),
                  region = c("B", sample(1:5,3)),
                  region = c("C", sample(1:5,3)),
                  region = c("A", sample(1:5,3)) )

> df
  region region.1 region.2 region.3
1      A        B        C        A
2      5        5        3        3
3      2        1        5        4
4      4        2        1        5

I want to select all columns that have an A in the first row. I can't do this with index numbers because my dataset has more than 3000 columns. The column names are also important, which is why I'm using the first row as a second header. For this example, the result should be:

  region  region.3
1      A         A
2      5         3
3      2         4
4      4         5 

Also, how can I avoid the automatic numbering that is appended to duplicate column names (region.1, region.2, ...)? Thanks for your ideas.

You can use a logical index built from the first row, as in

> df[, df[1, ] == "A"]
  region region.3
1      A        A
2      3        1
3      2        5
4      1        4
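
The same idea as a self-contained sketch. It uses fixed values instead of sample() so the result is reproducible. The names keep and res are only illustrative, and drop = FALSE is an addition that is not part of the original answer: it keeps the result a data frame even when only one column matches.

```r
# Fixed values instead of sample() so the example is reproducible.
# Mixing "A" with numbers in c() makes every column character.
df <- data.frame(region = c("A", "5", "2", "4"),
                 region = c("B", "5", "1", "2"),
                 region = c("C", "3", "5", "1"),
                 region = c("A", "3", "4", "5"),
                 stringsAsFactors = FALSE)

# One TRUE/FALSE per column: is the first entry of the column "A"?
keep <- vapply(df, function(col) col[1] == "A", logical(1))

# Select the matching columns; drop = FALSE always returns a data frame.
res <- df[, keep, drop = FALSE]
names(res)
# [1] "region"   "region.3"
```

Since the first row holds letters, the remaining values are stored as text; they would have to be converted (for example with as.numeric) before doing arithmetic on them.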

For your second question, try check.names=FALSE:

> data.frame( region = c("A", sample(1:5,3)),
+             region = c("B", sample(1:5,3)),
+             region = c("C", sample(1:5,3)),
+             region = c("A", sample(1:5,3)), check.names=FALSE )
  region region region region
1      A      B      C      A
2      5      5      4      2
3      2      1      5      5
4      4      2      2      4
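
One thing to keep in mind with duplicate names, shown as a small sketch with fixed values (df2 is an illustrative name): selecting a column by name only reaches the first match, so the remaining columns have to be selected by position or with a logical index such as the first-row test.

```r
# Same layout as above, but the duplicate names are kept as given.
df2 <- data.frame(region = c("A", "5", "2", "4"),
                  region = c("B", "5", "1", "2"),
                  region = c("C", "3", "5", "1"),
                  region = c("A", "3", "4", "5"),
                  check.names = FALSE, stringsAsFactors = FALSE)

names(df2)
# [1] "region" "region" "region" "region"

# Selecting by name returns only the first column called "region".
first_region <- df2[["region"]]

# The first row still identifies the columns unambiguously.
labels <- unlist(df2[1, ], use.names = FALSE)
which(labels == "A")
# [1] 1 4
```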


 