
How to create a new output file in R if a file with that name already exists?

I am trying to run an R script with Windows Task Scheduler so that it runs every two hours. The script gathers some tweets through the Twitter API and runs a sentiment analysis that produces two graphs and saves them in a directory. The problem is that when the script runs again, it replaces the existing files of the same name in the directory.

As an example, when I used the pdf("file") function, it ran fine the first time because no file with that name existed in the directory yet. But I want the R script to run every two hours, so I need a solution that creates a new file in the directory instead of replacing the existing one, just like what happens when a file is downloaded multiple times from Google Chrome.

I'd just time-stamp the file name. Here now() comes from the lubridate package; base R's Sys.time() works the same way:

> filename = paste("output-",now(),sep="")
> filename
[1] "output-2014-08-21 16:02:45"

Use any of the standard date formatting functions to customise to taste. You probably don't want spaces and colons in your file names (colons are not allowed in Windows file names at all):

> filename = paste("output-",format(Sys.time(), "%a-%b-%d-%H-%M-%S-%Y"),sep="")
> filename
[1] "output-Thu-Aug-21-16-03-30-2014"
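To tie this back to the pdf() call in the question, here is a minimal sketch of a timestamped plot file. The helper name timestamped_name, the "sentiment" prefix, and the use of tempdir() as the output directory are assumptions made for illustration. The year-first format is a choice made here so that file names sort chronologically.

```r
# Hypothetical helper: build a sortable, Windows-safe timestamped file name.
timestamped_name <- function(prefix, ext = "pdf", when = Sys.time()) {
  stamp <- format(when, "%Y%m%d-%H%M%S")   # e.g. 20140821-160330
  paste0(prefix, "-", stamp, ".", ext)
}

# tempdir() stands in for the real output directory.
out_file <- file.path(tempdir(), timestamped_name("sentiment"))

pdf(out_file)                          # open a PDF device on the new name
plot(1:10, main = "placeholder plot")  # the real graphs would go here
dev.off()                              # close the device so the file is written
```

Each run of the script then writes to a different file, as long as two runs never start within the same second.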

If you want the behaviour of adding a number to the file name, then something like this:

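serialNext = function(prefix){
    # If nothing is called `prefix` yet, use the name as-is
    if(!file.exists(prefix)){return(prefix)}
    i = 1
    repeat {
        # Otherwise try prefix.1, prefix.2, ... until a free name turns up
        f = paste(prefix,i,sep=".")
        if(!file.exists(f)){return(f)}
        i = i+1
    }
}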

Usage. First, "foo" doesn't exist, so it returns "foo":

> serialNext("foo")
[1] "foo"

Write a file called "foo":

> cat("fnord",file="foo")

Now it returns "foo.1":

> serialNext("foo")
[1] "foo.1"

Create that, then it returns "foo.2" and so on...

> cat("fnord",file="foo.1")
> serialNext("foo")
[1] "foo.2"
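With serialNext, a name such as "plot.pdf" would become "plot.pdf.1", which loses the .pdf ending. The following is a sketch of a variant that puts the counter before the extension instead. The name serialNextExt and the demo directory are hypothetical; tools::file_ext and tools::file_path_sans_ext ship with R.

```r
# Hypothetical variant of serialNext() that keeps the extension at the end.
serialNextExt <- function(path) {
  if (!file.exists(path)) return(path)
  ext  <- tools::file_ext(path)            # "pdf" for "plot.pdf", "" if none
  stem <- tools::file_path_sans_ext(path)  # path without the extension
  i <- 1
  repeat {
    candidate <- if (nzchar(ext)) {
      paste0(stem, ".", i, ".", ext)
    } else {
      paste0(stem, ".", i)
    }
    if (!file.exists(candidate)) return(candidate)
    i <- i + 1
  }
}

# Demo in a scratch directory.
demo_dir <- tempfile("serial-demo")
dir.create(demo_dir)
target <- file.path(demo_dir, "plot.pdf")

first <- serialNextExt(target)    # nothing there yet: returned unchanged
file.create(first)
second <- serialNextExt(target)   # basename(second) is "plot.1.pdf"
```

It has the same check-then-create gap as the original, so it is only suitable when a single process writes to the directory.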

This kind of thing can break if more than one process might be writing a new file, though: if both processes check at the same time, there's a window of opportunity where neither process sees "foo.2" and both think they can create it. The same thing will happen with timestamps if two processes try to write new files at the same time.

Both issues can be avoided by generating a random UUID and pasting it onto the file name. Otherwise you need something that's atomic at the operating system level.
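Base R has no UUID generator of its own, so as a stand-in for the random-name idea here is a sketch using tempfile(), which appends a random string to a pattern and returns a name that was not in use when it was called. This swaps a true UUID for tempfile()'s random suffix. The "plots" directory and the "output-" prefix are assumptions for illustration.

```r
# Hypothetical output directory; tempdir() stands in for the real location.
out_dir <- file.path(tempdir(), "plots")
dir.create(out_dir, showWarnings = FALSE)

# tempfile() only builds a name; it does not create the file.
out_file <- tempfile(pattern = "output-", tmpdir = out_dir, fileext = ".pdf")

pdf(out_file)
plot(1:10, main = "placeholder plot")
dev.off()
```

Random names no longer sort by time, so combining a timestamp with the random suffix may be more practical if the run order matters.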

But for a job that runs every two hours, I reckon a timestamp down to minutes is probably enough.

See ?files for the file manipulation functions. You can check whether a file exists with file.exists, and then either rename the existing file or create a different name for the new one.
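The "rename the existing file" option could look roughly like this. It is a sketch only: the helper name archive_existing and the idea of tagging the old file with its modification time are assumptions, not part of the answer above.

```r
# Hypothetical helper: move an existing file aside before it gets overwritten.
archive_existing <- function(path) {
  if (!file.exists(path)) return(invisible(NULL))
  # Tag the old file with the time it was last modified.
  stamp  <- format(file.info(path)$mtime, "%Y%m%d-%H%M%S")
  ext    <- tools::file_ext(path)
  stem   <- tools::file_path_sans_ext(path)
  backup <- paste0(stem, "-", stamp, if (nzchar(ext)) paste0(".", ext))
  if (!file.rename(path, backup)) stop("could not rename ", path)
  invisible(backup)
}

# Usage: call it just before writing, so the latest output keeps the plain name.
report <- file.path(tempdir(), "report.pdf")
archive_existing(report)
pdf(report)
plot(1:10, main = "placeholder plot")
dev.off()
```

With this approach the newest output always has the same name, and older runs sit next to it under timestamped names.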
