
Complex multiple CSV merging in R

We have two types of data frames, DF_1 and DF_2, each read from csv files whose names follow one of two patterns:

1) DF_1 is read from a csv file named like randomNumbers-text1.csv.

2) DF_2 is read from a csv file named like randomNumbers-text2.csv.

They merge cleanly into a single data frame called merged_DF with:

merged_DF = merge(DF_1, DF_2)
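
For illustration, here is a minimal sketch of what that call does. The data and the column names id, x and y are made up for this example and are not taken from the real files. With no `by` argument, merge() joins on whatever columns the two data frames have in common and keeps only the rows whose key appears in both.

```r
# Hypothetical data: the column names id, x and y are assumptions for illustration.
DF_1 <- data.frame(id = c(1, 2, 3), x = c("a", "b", "c"))
DF_2 <- data.frame(id = c(2, 3, 4), y = c(10, 20, 30))

# No `by` given, so merge() joins on the shared column (id) and
# keeps only ids present in both frames (an inner join).
merged_DF <- merge(DF_1, DF_2)
print(merged_DF)
```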

Now comes the tricky part.

The working directory contains around 13000 csv files, half of which follow the DF_1 file name pattern and the other half the DF_2 pattern (see points 1 and 2 above).

Problem: How does one perform the merge described above on all of the approximately 13000 csv files and combine the output into a single data frame, call it combined_merged_DF?

How does one solve this the R way?

Any help is much appreciated :)

Let's assume the files in the directory are 124-type1.csv, 723-type1.csv, 899-type1.csv, 124-type2.csv, 723-type2.csv, 100-type2.csv and wrong-file.csv. You could do:

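# List only files ending in ".csv" (the dot is escaped so it matches literally).
csv_files <- list.files("./path/to/csvs", pattern = "\\.csv$")
# list.files() returns the names sorted alphabetically:
# [1] "100-type2.csv"  "124-type1.csv"  "124-type2.csv"  "723-type1.csv"
# [5] "723-type2.csv"  "899-type1.csv"  "wrong-file.csv"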

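# Strip the suffix from each file name to get its id, then keep only the ids
# that have both a type1 and a type2 file.
ids_in_common <- intersect(
  sub("-type1\\.csv$", "", grep("-type1\\.csv$", csv_files, value = TRUE)),
  sub("-type2\\.csv$", "", grep("-type2\\.csv$", csv_files, value = TRUE))
)
# [1] "124" "723"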
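# Merge each matching pair, then stack the per-pair results into one data frame.
# read.csv() is used because the files are comma-separated (assuming a header row);
# read.table() would split on whitespace and treat the header as data.
combined_merged_DF <- do.call("rbind", lapply(ids_in_common, function(id) {
  merge(
    read.csv(file.path("./path/to/csvs", paste0(id, "-type1.csv"))),
    read.csv(file.path("./path/to/csvs", paste0(id, "-type2.csv")))
  )
}))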
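
To try the approach without touching the real directory, the same steps can be wrapped in a function and run against a small throwaway folder. This is only a sketch under stated assumptions: merge_csv_pairs is a hypothetical helper name, the demo columns key, a and b are invented, the files are assumed to be comma-separated with a header row, and every merged pair is assumed to produce the same columns (rbind requires that).

```r
# Hypothetical helper: merge each "<id>-type1.csv" with its "<id>-type2.csv"
# partner in `dir` and stack the results. Ids without a partner are skipped.
merge_csv_pairs <- function(dir) {
  csv_files <- list.files(dir, pattern = "\\.csv$")
  ids_type1 <- sub("-type1\\.csv$", "", grep("-type1\\.csv$", csv_files, value = TRUE))
  ids_type2 <- sub("-type2\\.csv$", "", grep("-type2\\.csv$", csv_files, value = TRUE))
  ids_in_common <- intersect(ids_type1, ids_type2)

  merged <- lapply(ids_in_common, function(id) {
    merge(
      read.csv(file.path(dir, paste0(id, "-type1.csv"))),
      read.csv(file.path(dir, paste0(id, "-type2.csv")))
    )
  })
  do.call(rbind, merged)
}

# Build a small demo directory with invented data (columns key, a, b are assumptions).
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(key = 1:2, a = c(0.1, 0.2)),
          file.path(demo_dir, "124-type1.csv"), row.names = FALSE)
write.csv(data.frame(key = 1:2, b = c("u", "v")),
          file.path(demo_dir, "124-type2.csv"), row.names = FALSE)
write.csv(data.frame(key = 3L, a = 0.3),
          file.path(demo_dir, "723-type1.csv"), row.names = FALSE)
write.csv(data.frame(key = 3L, b = "w"),
          file.path(demo_dir, "723-type2.csv"), row.names = FALSE)
# This file has no type2 partner, so it is ignored.
write.csv(data.frame(key = 9L, a = 0.9),
          file.path(demo_dir, "899-type1.csv"), row.names = FALSE)

combined_merged_DF <- merge_csv_pairs(demo_dir)
print(combined_merged_DF)
```

With thousands of pairs, collecting the merged pieces in a list and calling rbind once at the end, as here, avoids growing a data frame inside a loop.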

