
How to read in multiple txt files in R with differing number of columns

I am trying to read in multiple txt files with differing numbers of columns in R. I already saw "How can you read a CSV file in R with different number of columns" and tried it with colClasses and also col.names (both with fill=T), but it does not work. col.names gives me an error about an invalid value for quotes (whatever that is).

ep_dir <- "C:/Users/J/Desktop/e_prot_unicode"

# reading and merging data

# *txt

# reading the data. empty list that gets filled up

ep_ldf<-list()

# creates a list of all the files in the directory with ending .txt

listtxt_ep<-list.files(path = ep_dir, pattern="*.txt", full.names = T) 

# loop for reading all the files in the list

for(m in 1:length(listtxt_ep)){
  ep_ldf[[m]]<-read.table(listtxt_ep[m],fill=T,header=T,sep = "\t",stringsAsFactors=F,fileEncoding = "UTF-16LE",dec = ",",colClasses = c("numeric", rep("character", 41)))
  ep <- bind_rows(ep_ldf,ep_ldf[[m]])
}

#another try because it is not working properly

f_ep = "C:/Users/J/Desktop/e_prot_unicode/22WS.U1"

#reading and merging the files, data.table is then called d_ep

d_ep = data.frame()
for(f_ep in listtxt_ep){
  tmp_ep <- read.delim(f_ep,row.names = NULL,sep = "\t",fileEncoding="UTF-16LE",fill = T,header = T,dec = ",",col.names = "V",seq_len(41)) %>% as.data.frame(stringsAsFactors = F)
  d_ep <- rbind.fill(d_ep, tmp_ep) 
}

How to read in multiple files with differing column number into R?

Not sure I totally follow what you are trying to do, but dplyr's bind_rows() function allows you to combine data frames with different columns; columns missing from one data frame are filled with NA. You can also pass a list of data frames to bind_rows() and it will combine all of them at once, which will simplify some of your code.
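A minimal sketch of that approach, under some assumptions: the helper name read_ep_file and the two demo files are made up for illustration, and every column is read as character so that same-named columns never clash in type when the files are bound together. The demo files are written in the default encoding; for the files in the question you would keep the default encoding = "UTF-16LE".

```r
library(dplyr)

# Hypothetical helper: read one tab-separated file with every column as
# character, so same-named columns are type-compatible across files.
read_ep_file <- function(path, encoding = "UTF-16LE") {
  read.delim(path, sep = "\t", header = TRUE, fill = TRUE,
             colClasses = "character", fileEncoding = encoding,
             stringsAsFactors = FALSE)
}

# Demo data: two small files with different columns (invented for this sketch).
demo_dir <- tempfile("ep_demo")
dir.create(demo_dir)
writeLines(c("id\ta\tb", "1\tx\ty"), file.path(demo_dir, "one.txt"))
writeLines(c("id\ta\tc\td", "2\tp\tq\tr"), file.path(demo_dir, "two.txt"))

# "\\.txt$" is a regular expression matching names that end in ".txt".
files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Read every file into a list, then bind once; missing columns become NA.
ep_list <- lapply(files, read_ep_file, encoding = "")
ep <- bind_rows(ep_list)

# Convert the character columns to their natural types afterwards,
# treating "," as the decimal mark as in the question.
ep <- type.convert(ep, as.is = TRUE, dec = ",")
print(ep)
```

As for the col.names error: in the second attempt, seq_len(41) is passed as a separate unnamed argument, so read.delim most likely matches it to its quote parameter, which would explain a complaint about quotes. If fixed column names are wanted, col.names = paste0("V", seq_len(41)) is probably what was intended, though it only fits files that have exactly 41 columns.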

