How do I create a new output file in R if a file with that name already exists?

Question

I am trying to run an R script with Windows Task Scheduler every two hours. The script gathers some tweets through the Twitter API and runs a sentiment analysis that produces two graphs and saves them in a directory. The problem is that when the script runs again, it replaces the existing files of the same name in the directory.

For example, when I used the pdf("file") function, it ran fine the first time because no file with that name existed in the directory. But I want the R script to run every other hour, so I need a solution that creates a new file in the directory instead of replacing the existing one, just like what happens when a file is downloaded multiple times from Google Chrome.

Answer 1

I'd just time-stamp the file name.

> library(lubridate)  # now() is not base R; lubridate provides it
> filename = paste("output-",now(),sep="")
> filename
[1] "output-2014-08-21 16:02:45"

Use any of the standard date formatting functions to customise to taste. You may not want spaces and colons in your file names (Windows does not allow colons in file names at all):

> filename = paste("output-",format(Sys.time(), "%a-%b-%d-%H-%M-%S-%Y"),sep="")
> filename
[1] "output-Thu-Aug-21-16-03-30-2014"
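
To tie this back to the question, here is a minimal sketch of a time-stamped name being passed to pdf(). The directory `out_dir`, the `sentiment-` prefix, and the placeholder plot are assumptions made for illustration; they are not part of the original script.

```r
# Hypothetical output directory; a temp folder keeps the sketch self-contained.
out_dir <- file.path(tempdir(), "graphs")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Sortable stamp with no spaces or colons, in the form YYYY-MM-DD_HH-MM-SS.
stamp <- format(Sys.time(), "%Y-%m-%d_%H-%M-%S")
plot_file <- file.path(out_dir, paste0("sentiment-", stamp, ".pdf"))

pdf(plot_file)        # each scheduled run opens a differently named file
plot(1:10)            # placeholder for the real sentiment graph
invisible(dev.off())  # close the device so the file is written to disk
```

Putting the year first makes the files sort chronologically in a directory listing, which the day-name-first format above does not.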

If you want the behaviour of adding a number to the file name, then something like this:

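serialNext = function(prefix){
    # If nothing is called `prefix` yet, use it unchanged.
    if(!file.exists(prefix)){return(prefix)}
    i=1
    # Otherwise try prefix.1, prefix.2, ... until a free name turns up.
    repeat {
        f = paste(prefix,i,sep=".")
        if(!file.exists(f)){return(f)}
        i=i+1
    }
}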

Usage. First, "foo" doesn't exist, so it returns "foo":

> serialNext("foo")
[1] "foo"

Write a file called "foo":

> cat("fnord",file="foo")

Now it returns "foo.1":

> serialNext("foo")
[1] "foo.1"

Create that, then it returns "foo.2" and so on:

> cat("fnord",file="foo.1")
> serialNext("foo")
[1] "foo.2"
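
For files with an extension, such as the PDFs in the question, the counter probably belongs before the extension so the file still opens as a PDF. The following is a sketch of that variation; `serialNextExt` is a hypothetical name, and it relies on `file_ext()` and `file_path_sans_ext()` from the `tools` package that ships with R.

```r
# Hypothetical variant of serialNext() that puts the counter before the extension.
serialNextExt <- function(path) {
  if (!file.exists(path)) return(path)
  ext  <- tools::file_ext(path)            # "pdf" for "plot.pdf", "" if none
  stem <- tools::file_path_sans_ext(path)  # path without the trailing ".pdf"
  i <- 1
  repeat {
    candidate <- if (nzchar(ext)) {
      paste0(stem, ".", i, ".", ext)
    } else {
      paste0(stem, ".", i)
    }
    if (!file.exists(candidate)) return(candidate)
    i <- i + 1
  }
}

# Demonstration in a throwaway directory.
demo_dir <- tempfile("serial-demo-")
dir.create(demo_dir)
target <- file.path(demo_dir, "plot.pdf")

first <- serialNextExt(target)    # nothing exists yet, so the name is unchanged
invisible(file.create(first))
second <- serialNextExt(target)   # plot.pdf is taken, so this ends in plot.1.pdf
```

Like `serialNext`, this only checks for existence and does not create the file itself.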

This approach can break if more than one process might be writing a new file: if two processes check at the same time, there is a window in which neither sees "foo.2" and both think they can create it. The same thing can happen with timestamps if two processes try to write new files at the same moment.

Both issues can be avoided by generating a random UUID and appending it to the file name; otherwise you need something that is atomic at the operating system level.
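
Generating a real UUID in R would need an add-on package, which the answer does not name. As a stand-in, base R's `tempfile()` appends a random string to a prefix and returns a name that is not in use at the time of the call. The sketch below assumes a hypothetical output directory and `output-` prefix; it makes collisions very unlikely rather than impossible, and it is not an atomic operation.

```r
# Hypothetical output directory inside the session's temp folder.
out_dir <- file.path(tempdir(), "graphs-random")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# tempfile() adds a random suffix and returns a name not currently in use.
a <- tempfile(pattern = "output-", tmpdir = out_dir, fileext = ".pdf")
invisible(file.create(a))   # claim the first name before asking for another

b <- tempfile(pattern = "output-", tmpdir = out_dir, fileext = ".pdf")
```

Because `a` already exists when `b` is requested, the two names are guaranteed to differ.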

But for a job that runs once every two hours, I reckon a timestamp down to the minute is probably enough.

Answer 2

See ?files for file manipulation functions. You can check whether a file exists with file.exists, and then either rename the existing file or create a different name for the new one.
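
The rename option could look roughly like this. `backupExisting` is a hypothetical helper, and stamping the old file with its modification time is an assumption about what a sensible backup name would be. If a backup with the same stamp already exists, `file.rename()` may overwrite it on some platforms.

```r
# Hypothetical helper: move an existing file aside before writing a new one.
backupExisting <- function(path) {
  if (!file.exists(path)) return(invisible(NULL))
  # Stamp the old file with its last-modified time.
  stamp  <- format(file.mtime(path), "%Y-%m-%d_%H-%M-%S")
  ext    <- tools::file_ext(path)
  suffix <- if (nzchar(ext)) paste0(".", ext) else ""
  backup <- paste0(tools::file_path_sans_ext(path), "-", stamp, suffix)
  file.rename(path, backup)
  invisible(backup)
}

# Demonstration in a throwaway directory.
demo_dir <- tempfile("rename-demo-")
dir.create(demo_dir)
target <- file.path(demo_dir, "plot.pdf")

cat("old run", file = target)         # pretend an earlier run wrote this
old_copy <- backupExisting(target)    # the earlier output is moved aside
cat("new run", file = target)         # the fixed name is free again
```

This keeps the newest output under a fixed name, which is handy if something else always reads "plot.pdf".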

2 answers

Answer 1: score 6, accepted, 2014-08-21 15:04:02
Answer 2: score 1, 2014-08-21 15:05:59
