
How to select a subset of a dataframe using a variable dynamically

I have an R data frame with 300 columns. I ran a principal component analysis (PCA) and picked the top 110 columns that explain the variability of the dataset. How do I pass this list of 110 column names to an R function to select the subset of the original data frame containing only these columns?

Please see the code below.

t.df = read.xlsx('t_dataset.xlsx', 1,startRow=1 )
X=t.df[ , 3:307]
t.pca=PCA(X, graph=FALSE)
write.infile(t.pca$var$contrib, "pca.csv", sep=',')

t.df.pca = read.xlsx('pca1.xlsx', 1,startRow=1 )
t.df.pca.r=subset(t.df.pca, Dim.1>mean(Dim.1) | Dim.2>mean(Dim.2) | Dim.3>mean(Dim.3) | Dim.4>mean(Dim.4) | Dim.5>mean(Dim.5))

c1=c(t.df.pca.r$Column)

#c1 contains the list of 110 column names.

c2=cat(paste(shQuote(c1), collapse=", "))
print(c2)

Output of print(c2): "funct", "pronoun", "ppron", "i", "we", "you", "shehe", "they", "ipron", "article", "verb", "auxverb", "past", "present", "future", "adverb", "conj", and so on, up to 110 variables

t.df.2=t.df[c(c2)]
nrow(t.df.2)
ncol(t.df.2)

t.df.4=t.df[c2]
nrow(t.df.4)
ncol(t.df.4)

t.df.5=t.df[ ,c2]
nrow(t.df.5)
ncol(t.df.5)

The code above returns the same result for each of the three attempts:

[1] 45498
[1] 0

What I need: pass these column names to an R function and get a subset of the original data frame t.df that contains only the 110 columns listed in c1.

How can I do this?

Here's one way to do it, with an example data frame:

library(tidyverse)

df <-
  tibble(
    col1 = c(1, 2, 3),
    col2 = c(2, 3, 4),
    col3 = c(3, 4, 5)
  )

cols_to_keep <- c("col1", "col3")

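# all_of() tells select() that cols_to_keep is an external character
# vector of column names, not a column called "cols_to_keep"
df %>%
  select(all_of(cols_to_keep))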

I don't know what format your data is in, but as long as you have a character vector of column names, you should be able to pass it to select().
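For what it's worth, the zero-column result in the question seems to come from cat(): it prints its arguments and returns NULL, so c2 ends up empty and t.df[c2] selects nothing. Here is a minimal base R sketch of that, assuming c1 can be used as a character vector of names. The small t.df below is a made-up stand-in for the real data (only the column names are borrowed from the question), and the as.character() call is a guess that c1 may have been read in as a factor.

```r
# Hypothetical stand-in for the real t.df; values are made up.
t.df <- data.frame(
  id      = 1:3,
  funct   = c(0.1, 0.2, 0.3),
  pronoun = c(1, 2, 3),
  verb    = c(4, 5, 6)
)

# In the question this would be t.df.pca.r$Column. as.character() is a
# precaution in case the column was read in as a factor.
c1 <- as.character(c("funct", "verb"))

# cat() prints the quoted names but returns NULL, so c2 holds nothing.
c2 <- cat(paste(shQuote(c1), collapse = ", "), "\n")
is.null(c2)      # TRUE

# Indexing with NULL keeps all rows and no columns.
ncol(t.df[c2])   # 0

# Index with the character vector itself instead.
# drop = FALSE keeps a data frame even if only one column is selected.
t.df.sub <- t.df[, c1, drop = FALSE]
dim(t.df.sub)    # 3 2

# Optional check: names in c1 that do not exist in t.df would cause an
# "undefined columns selected" error, so list them first.
missing_cols <- setdiff(c1, names(t.df))
```

So there is no need to build a quoted, comma-separated string at all. The vector of names can go straight into the square brackets, or into select() as shown above.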

