
How can I select columns using the column names of a different data frame?

I want to select columns (or make a subset) using the column names of a different data frame.

For example:

A dataframe: 3 columns

  • each column name: ab, cd, de

B dataframe: 10 columns

  • each column name: n_ab, n_cd_e, n_de, ab, fg, n_ef, tt, yy, zz, n_a2

I want to make two subsets of the B data frame:

  1. data frame C with the columns n_ab, n_cd_e, n_de, ab

  2. data frame D with the column ab

How can I make the C and D data frames?

I expected that I could subset B with the code below, but I couldn't, because contains() only selects columns by a literal string that I type in.

ge<-select(ge.n, contains('ge'))

3) How can I select columns (or make a subset) using a condition (like >=, %in%, ==, etc.)?

Thanks

To create C, you can use grepl with an OR pattern built from the column names of A.

C = B[, grepl(paste0(names(A), collapse="|"), names(B)), drop=FALSE]
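To see why this works, it can help to look at the pattern and the logical mask on their own. The sketch below uses only the column names from the question, so no data is needed. The name "cab" and the stricter pattern are hypothetical additions, included only to show that grepl matches a substring anywhere in the name.

```r
# Column names taken from the question
a_names <- c("ab", "cd", "de")
b_names <- c("n_ab", "n_cd_e", "n_de", "ab", "fg", "n_ef", "tt", "yy", "zz", "n_a2")

# Build one regular expression that matches any of A's names
pattern <- paste(a_names, collapse = "|")   # "ab|cd|de"

# TRUE where a name of B contains at least one of A's names
mask <- grepl(pattern, b_names)
b_names[mask]                               # "n_ab" "n_cd_e" "n_de" "ab"

# Caveat: this is a substring match, so a hypothetical column
# named "cab" would be selected as well
grepl(pattern, "cab")                       # TRUE

# A stricter pattern (an assumption about the naming scheme): match A's
# names only when delimited by "_" or by the start/end of the name
strict <- paste0("(^|_)(", pattern, ")(_|$)")
grepl(strict, c("n_ab", "cab", "n_cd_e"))   # TRUE FALSE TRUE
```

If the names of A could contain regular-expression metacharacters such as "." or "(", they would need to be escaped before being pasted into the pattern.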

To create D you can use %in% directly.

D = B[, names(B) %in% names(A), drop=FALSE]
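Because %in% compares whole names, only exact matches survive. Here is a minimal sketch with small stand-in data frames; the values are made up for illustration and only the column names follow the question.

```r
# Stand-in data: only the column names follow the question
A <- data.frame(ab = 1:3, cd = 4:6, de = 7:9)
B <- data.frame(n_ab = 1:3, n_cd_e = 4:6, n_de = 7:9, ab = 10:12, fg = 13:15)

# Logical mask: TRUE only where a name of B is exactly one of A's names
keep <- names(B) %in% names(A)
D <- B[, keep, drop = FALSE]   # drop = FALSE keeps D a data frame
names(D)                       # "ab"

# Comparable selection by name; intersect() skips names missing from B
D2 <- B[, intersect(names(B), names(A)), drop = FALSE]
names(D2)                      # "ab"
```

Without drop = FALSE, a single-column result like this one would be simplified to a plain vector. If dplyr is available, select(B, any_of(names(A))) should give a comparable exact-name selection, and select(B, matches(paste(names(A), collapse = "|"))) a comparable pattern-based one.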

Outputs (C and D, respectively):

        n_ab     n_cd_e       n_de         ab
1 -0.4456620  0.4007715  1.7869131  0.7013559
2  1.2240818  0.1106827  0.4978505 -0.4727914
3  0.3598138 -0.5558411 -1.9666172 -1.0678237


          ab
1  0.7013559
2 -0.4727914
3 -1.0678237

Inputs:

set.seed(123)
A = setNames(as.data.frame(
  replicate(3,rnorm(3))),  c("ab","cd","de")
)
B = setNames(as.data.frame(
  replicate(10,rnorm(3))),  c("n_ab", "n_cd_e", "n_de", "ab", "fg", "n_ef", "tt", "yy", "zz", "n_a2")
)
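The third part of the question (selecting columns by a condition such as >=) is not covered by the name-based approaches above. One possible approach in base R, sketched here under the assumption that the condition is evaluated on each column's values, is to compute one TRUE/FALSE per column and use it as a column mask. The data and the thresholds are invented for illustration.

```r
# Stand-in data (values invented for illustration)
B <- data.frame(n_ab = c(0.5, 1, 1.5), fg = c(-1, 3, 7), tt = c(10, 20, 30))

# Keep the columns whose mean is >= 2
keep_mean <- sapply(B, mean) >= 2
names(B)[keep_mean]                      # "fg" "tt"
C_mean <- B[, keep_mean, drop = FALSE]

# Keep the columns in which every value is positive
keep_pos <- vapply(B, function(col) all(col > 0), logical(1))
names(B)[keep_pos]                       # "n_ab" "tt"
C_pos <- B[, keep_pos, drop = FALSE]
```

The same mask idea applies to conditions on the names themselves, for example names(B) == "tt" or nchar(names(B)) >= 3.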
