
How to create a subset of a DataFrame by giving a condition regarding the column names

Sorry if this is a very simple question; I'm new to programming. I want to create a subset of a DataFrame (the eclipse dataset) by selecting specific columns. There are 212 columns in total and I need 41 of them, so writing out every column name as a list would be too long (and, I suppose, not a nice way to code). Instead, I decided to select the columns by the beginning of their names, which shortens the list to 15 elements. The column names start with specific strings such as "NOF", "NOM", "NSF", etc., and I want to extract the columns whose names start with these strings to create my new subset. I tried to run the code below:

eclipse_train <- subset(eclipse, select = starts_with(predictors))

Here predictors is a list of the strings that I want the column names to start with. But of course, it gave the error:

Error in starts_with(predictors): is_string(match) is not TRUE

I could not come up with any other way to keep only the columns whose names start with the strings I want. How can I implement such a thing?

Assuming the eclipse data frame shown in the Note at the end, use grep to find the indices of the column names that start with the indicated strings, and subscript by those indices. No packages are used.

eclipse[ grep("^(NOF|NOM|NSF)", names(eclipse)) ]

giving:

  NOFX NOMX NSFX
1    2    3    4
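With 15 prefixes, the pattern does not have to be typed by hand; it can be assembled from a character vector. The following is a minimal sketch, assuming predictors is a character vector of plain prefixes that contain no regular-expression metacharacters, and using the sample data from the Note (built here with as.list so the block runs on its own):

```r
# Sample data equivalent to the eclipse data frame in the Note
nms <- c("A", paste0(c("NOF", "NOM", "NSF"), "X"), "B")
eclipse <- as.data.frame(as.list(setNames(seq_along(nms), nms)))

# Assumption: the wanted prefixes are held in a character vector
predictors <- c("NOF", "NOM", "NSF")

# Collapse the prefixes into one anchored alternation: "^(NOF|NOM|NSF)"
pattern <- paste0("^(", paste(predictors, collapse = "|"), ")")

# Keep only the columns whose names match the pattern
eclipse_train <- eclipse[grep(pattern, names(eclipse))]
names(eclipse_train)
```

The "^" anchors the match at the start of each name, so a column that merely contains "NOF" somewhere in the middle would not be selected.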

Note

If the desired columns are contiguous, as they are in the example data frame in the Note at the end, then specifying just the first and last column names would also work.

subset(eclipse, select = NOFX:NSFX)

giving the same result.
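When the wanted columns are not contiguous, or the prefixes might contain characters that are special in regular expressions, a literal prefix test is another base R option. This is only a sketch under the same assumptions as before: predictors is a hypothetical character vector of prefixes, and the data are the sample from the Note.

```r
# Sample data equivalent to the eclipse data frame in the Note
nms <- c("A", paste0(c("NOF", "NOM", "NSF"), "X"), "B")
eclipse <- as.data.frame(as.list(setNames(seq_along(nms), nms)))

# Assumption: the wanted prefixes are held in a character vector
predictors <- c("NOF", "NOM", "NSF")

# For each prefix, startsWith() returns one logical per column name;
# Reduce() combines them with "|" so a column is kept if any prefix matches
keep <- Reduce(`|`, lapply(predictors, function(p) startsWith(names(eclipse), p)))

eclipse_train <- eclipse[keep]
names(eclipse_train)
```

startsWith() compares the strings literally, so no escaping is needed; it is available in base R from version 3.3.0.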

Note

nms <- c("A", paste0(c("NOF", "NOM", "NSF"), "X"), "B")
eclipse <- as.data.frame.list(setNames(seq_along(nms), nms))

which looks like this:

> eclipse
  A NOFX NOMX NSFX B
1 1    2    3    4 5
