
R read in the latest version of a file

I am currently reading in a file as in the example below. Is there a way to read the latest version of the file? For example, if I have files saved as "Abroad v1.csv" and "Abroad v2.csv", I would want it to take the latest, which would be v2 in this case.

year <- "2015"
species <- "HOM"

root <- "Y:/Pelagic Work/FIN Data"
file <- "Abroad.csv"
ABR <- file.path(root, year, species, file)

If at all reasonable, it is best to determine the "latest" version of a file from the data in the file.info output (this post, also suggested by zx8754, is a good answer for that).
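As a rough illustration of the file.info approach, here is a minimal sketch. The helper name latest_by_mtime and the pattern argument are my own assumptions, not something from the linked post, and it treats "latest" as "most recently modified", which only holds if older versions are never re-saved.

```r
# Hypothetical helper (not from the linked post): return the most recently
# modified file in `dir` whose name matches the regular expression `pattern`.
latest_by_mtime <- function(dir, pattern) {
  candidates <- list.files(dir, pattern = pattern, full.names = TRUE)
  if (length(candidates) == 0) {
    stop("No files matching '", pattern, "' in ", dir)
  }
  info <- file.info(candidates)
  # mtime is POSIXct; compare it as a plain number of seconds
  candidates[which.max(as.numeric(info$mtime))]
}
```

With the variables from the question, this might be called as latest_by_mtime(file.path(root, year, species), "^Abroad.*[.]csv$").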

If you must do it by filename, be very careful and aware of how your operating system is going to sort characters. Take this example.

files <- paste0("somepath/directory/filename v", 1:10, ".csv")

basenames <- basename(files)

sort(basenames)

 [1] "filename v1.csv"  "filename v10.csv" "filename v2.csv"  "filename v3.csv"  "filename v4.csv" 
 [6] "filename v5.csv"  "filename v6.csv"  "filename v7.csv"  "filename v8.csv"  "filename v9.csv"

As you can see, "filename v10.csv" appears in the second position and would not be picked up by simple methods such as tail(basenames, 1). Instead, you need to strip out all of the characters except for those that specify the order of the versions, convert the result to the correct format, and then sort. Here's an example of how to do this with integer versions, as you've suggested you have.

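# Split each path into its directory and file name
Files <- data.frame(path = dirname(files),
                    file = basename(files),
                    stringsAsFactors = FALSE)
# Strip everything except the version number, then convert it to numeric
Files$v_number <- gsub("(^filename v|[.]csv$)", "", Files$file)
Files$v_number <- as.numeric(Files$v_number)

# Sort by numeric version; the last row is the latest version
Files <- Files[order(Files$v_number), , drop = FALSE]
tail(Files, 1)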
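The same idea can be wrapped in a small function. This is only a sketch: the name latest_by_version is hypothetical, and it assumes file names of the exact form "<prefix> v<integer>.csv" with a prefix that contains no regular-expression metacharacters.

```r
# Hypothetical helper: from a vector of paths, return the one with the
# highest integer version, assuming names like "<prefix> v<integer>.csv".
latest_by_version <- function(paths, prefix) {
  pattern <- paste0("^", prefix, " v([0-9]+)[.]csv$")
  names_only <- basename(paths)
  matched <- grepl(pattern, names_only)
  if (!any(matched)) {
    stop("No versioned files found for prefix: ", prefix)
  }
  # Keep only the captured digits and compare them as integers, not strings
  versions <- as.integer(sub(pattern, "\\1", names_only[matched]))
  paths[matched][which.max(versions)]
}
```

For the question's layout, something like latest_by_version(list.files(file.path(root, year, species), full.names = TRUE), "Abroad") would return the full path to pass to read.csv.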

This, however, is clunky and error prone. If at all possible, I'd recommend transitioning to a database, or version control, or both.
