
How to import most recent csv file into RStudio

I'm attempting to import the most recent .csv from my working directory into R. I'm adamant this method was working previously, but it no longer appears to.

Each day a .csv file is written to my designated folder, from which I import it into RStudio for manipulation. There are currently 2 files in this folder.

Please see the code and description below:

1) The following code retrieves the names of all csv files in the directory.

# find filenames of all .csvs in directory 
filenames <- Sys.glob("*.csv")

> filenames
[1] "February 26, 2018 at 03:59PM myfile.csv" "February 27, 2018 at 04:00PM myfile.csv"

2) The next step is to remove the redundant info from the filename string and keep just the date info:

# remove redundant file info  
newdates <- sub("at.*", "", filenames)

> newdates
[1] "February 26, 2018 " "February 27, 2018 "

3) Then I remove the comma from the date:

# remove comma from date string 
newdates <- gsub('\\$|,', '', newdates)

> newdates
[1] "February 26 2018 " "February 27 2018 "

4) In this step I change the date format:

# change to short date format
betterdate <- as.Date(newdates,format = "%B %d %Y")

> betterdate 
[1] "2018-02-26" "2018-02-27"

5) Then I set max(betterdate) as the latest file:

# takes latest file name as most recent file 
latestfile <- max(betterdate)

> latestfile 
[1] "2018-02-27"

6) And finally I import this file:

# import file with latest date 
 rawfile <- read.csv(file=latestfile, header=TRUE, sep=",")

As I say, this inelegant solution was previously working as designed; however, after some weeks I now receive this error message:

Error in read.table(file = file, header = header, sep = sep, quote = quote, : 'file' must be a character string or connection

Is it possible to explain what the issue is and how I might go about this whole endeavour in a better way?

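You can use which.max to get the index of the most recent date and use it to retrieve the matching file name from the filenames vector. The error occurs because max(betterdate) returns a Date object rather than a file name, and read.csv requires a character string or connection.

rawfile <- read.csv(file = filenames[which.max(betterdate)], header = TRUE, sep = ",")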
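If more than one file can arrive on the same day, the date alone will not separate them. As a rough sketch, the whole approach could be wrapped in one helper that parses the time as well. The function name latest_by_name is hypothetical, and the sketch assumes every file name starts with a stamp like "February 26, 2018 at 03:59PM" and that R runs in an English (or C) locale, since %B and %p are locale dependent.

```r
# Hypothetical helper (not from the original post): return the file name
# whose leading timestamp is the most recent one.
latest_by_name <- function(filenames) {
  # Keep only the leading "Month DD, YYYY at HH:MMAM/PM" part of each name;
  # names that do not match the pattern are left unchanged by sub()
  stamps <- sub("^(.* at [0-9]{2}:[0-9]{2}[AP]M).*$", "\\1", filenames)

  # Parse date and time together; anything that fails to parse becomes NA
  parsed <- as.POSIXct(stamps, format = "%B %d, %Y at %I:%M%p", tz = "UTC")
  if (all(is.na(parsed))) stop("no file name matched the expected pattern")

  # which.max() skips NA values and returns the index of the latest stamp
  filenames[which.max(as.numeric(parsed))]
}

# Possible usage (assumes the .csv files are in the working directory):
# rawfile <- read.csv(latest_by_name(Sys.glob("*.csv")), header = TRUE)
```

The helper returns a character string, so it can be passed straight to read.csv.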

If you can trust the creation time tracked by the operating system:

data_files <- file.info(Sys.glob("*.csv"))
row.names(data_files)[which.max(data_files[["ctime"]])]
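One caveat: per R's file.info documentation, ctime is the creation time on native Windows file systems, but on most Unix-alikes it is the time of the last status change. A variant that uses the modification time (mtime) instead may be more portable. The sketch below swaps in mtime under that assumption; the function name latest_by_mtime is hypothetical.

```r
# Hypothetical helper (not from the original answer): return the path of the
# most recently modified .csv file in `dir`, using mtime instead of ctime.
latest_by_mtime <- function(dir = ".") {
  # List only files ending in .csv, keeping the directory in the path
  paths <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  if (length(paths) == 0) stop("no .csv files found in ", dir)

  # file.info() returns one row per file; mtime is a POSIXct column
  info <- file.info(paths)
  paths[which.max(as.numeric(info$mtime))]
}

# Possible usage:
# rawfile <- read.csv(latest_by_mtime("."), header = TRUE)
```

This only gives the right answer if the files are not modified after they are written; otherwise the date embedded in the file name is the safer source.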
