
Loading Multiple .txt files where columns are separated by | character in R

I am attempting to load multiple text files into R. In each file, the columns are separated by the "|" character.

To give a sense of the file structure, a typical row looks like:

congression printer|-182.431552949032

In this file I want to separate the text string (congression printer) from the numeric value.

When using the following code:

folder <- '~/filepath'
file_list <- list.files(path=folder, pattern="*.txt")
data <- 
 do.call('rbind',
         lapply(file_list,
                function(x)
                  read.table(paste(folder, x, sep= ""),
                             header = TRUE, row.names = NULL)))

It'll load in the data as:

    [1]         [2]
congression  printer|-182.431552949032

Is there a way to correct this afterwards using the tidyr::separate() function, or to head off the problem when the files are first read? When I tried simply putting sep = "|" in the code above, it only affected how my text files are found, so that doesn't really work.

This is easier with data.table: fread() detects the separator automatically, and rbindlist() stacks the resulting tables:

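library(data.table)
folder <- '~/filepath'
# pattern is a regular expression, so match a literal ".txt" at the end of the name
pathsList <- list.files(path = folder, pattern = "\\.txt$", full.names = TRUE)
# fread() detects the "|" separator; rbindlist() stacks the tables
rbindlist(lapply(pathsList, fread))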

This works too:

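folder <- '~/filepath'
# full.names = TRUE returns complete paths, so no paste() step is needed
file_list <- list.files(path = folder, pattern = "\\.txt$", full.names = TRUE)
data <- 
  do.call('rbind',
          lapply(file_list,
                 function(x)
                   # sep = "|" goes in read.table(), not in paste()
                   read.table(x, sep = "|",
                              header = TRUE, row.names = NULL)))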
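
To check the base R approach without touching real data, here is a minimal, self-contained sketch. It assumes each file starts with a header row; the column names term and score, the file names, and the temporary folder are all made up for illustration. The quote = "" and comment.char = "" arguments are an extra precaution, not part of the answer above, so that apostrophes or "#" inside the text column are not treated specially.

```r
# Hypothetical demo data: two small pipe-delimited files in a temporary folder.
folder <- file.path(tempdir(), "pipe_demo")
dir.create(folder, showWarnings = FALSE)
writeLines(c("term|score", "congression printer|-182.431552949032"),
           file.path(folder, "a.txt"))
writeLines(c("term|score", "another term|3.5"),
           file.path(folder, "b.txt"))

# full.names = TRUE returns complete paths, so no paste() step is needed.
paths <- list.files(folder, pattern = "\\.txt$", full.names = TRUE)

# sep = "|" is an argument of read.table(), which splits each row on the pipe.
read_one <- function(p) {
  read.table(p, sep = "|", header = TRUE, quote = "", comment.char = "",
             stringsAsFactors = FALSE)
}

# Stack the per-file data frames into one.
data <- do.call(rbind, lapply(paths, read_one))
str(data)
```

With the separator set at read time, the text lands in one column and the number in another, so no tidyr::separate() step should be needed afterwards.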


 