Complex multiple CSV merging in R
We have two types of data frames, DF_1 and DF_2, each read from CSV files whose names follow one of two patterns:
1) DF_1 is read from files named like randomNumbers-text1.csv.
2) DF_2 is read from files named like randomNumbers-text2.csv.
Each such pair merges cleanly into a single data frame called merged_DF with:
merged_DF = merge(DF_1, DF_2)
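For illustration only, here is a minimal sketch of what that call does by default. The column names (key, a, b) and the values are hypothetical, since the real files are not shown; with no `by` argument, merge() joins on the columns the two data frames have in common and keeps only the rows that match in both.

```r
# Hypothetical example data; the column names and values are assumptions
DF_1 <- data.frame(key = c(1, 2, 3), a = c("x", "y", "z"))
DF_2 <- data.frame(key = c(2, 3, 4), b = c(TRUE, FALSE, TRUE))

# With no `by` argument, merge() joins on the shared column ("key" here)
# and keeps only rows present in both inputs (an inner join)
merged_DF <- merge(DF_1, DF_2)
print(merged_DF)
```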
Now comes the tricky part.
The working directory contains around 13000 CSV files: half follow the DF_1 file name pattern and the other half follow the DF_2 pattern (see points 1 and 2 above).
Problem: How does one perform the merge described above on all of the approximately 13000 CSV files and combine the output into a single data frame, call it combined_merged_DF?
How does one solve this the R way?
Any help is much appreciated :)
Let's assume the files in the directory are 124-type1.csv, 723-type1.csv, 899-type1.csv, 124-type2.csv, 723-type2.csv, 100-type2.csv, and wrong-file.csv. You could do:
# List every CSV file in the directory (list.files() returns names sorted alphabetically)
csv_files <- list.files("./path/to/csvs", pattern = "\\.csv$")
# [1] "100-type2.csv"  "124-type1.csv"  "124-type2.csv"  "723-type1.csv"
# [5] "723-type2.csv"  "899-type1.csv"  "wrong-file.csv"

# Keep only the ids that have both a type1 and a type2 file
ids_in_common <- intersect(
  sub("-type1\\.csv$", "", grep("-type1\\.csv$", csv_files, value = TRUE)),
  sub("-type2\\.csv$", "", grep("-type2\\.csv$", csv_files, value = TRUE))
)
# [1] "124" "723"

# Merge each pair, then stack the merged pieces into one data frame
combined_merged_DF <- do.call("rbind", lapply(ids_in_common, function(id) {
  merge(
    read.csv(file.path("./path/to/csvs", paste0(id, "-type1.csv"))),
    read.csv(file.path("./path/to/csvs", paste0(id, "-type2.csv")))
  )
}))
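If you want to try the approach without touching the real directory, the following is a self-contained sketch under stated assumptions. The demo files, their columns (key, a, b), the helper merge_pair() and the extra source_id column are all hypothetical and not part of the question; the sketch writes a few small CSVs to a temporary directory and runs the same pair-and-stack logic on them.

```r
# Hypothetical demo data written to a temporary directory (not the real files)
csv_dir <- file.path(tempdir(), "csv_merge_demo")
dir.create(csv_dir, showWarnings = FALSE)
unlink(list.files(csv_dir, full.names = TRUE))  # start from an empty directory

write_demo <- function(df, name) {
  write.csv(df, file.path(csv_dir, name), row.names = FALSE)
}
write_demo(data.frame(key = 1:2, a = c(10, 20)),     "124-type1.csv")
write_demo(data.frame(key = 1:2, b = c("x", "y")),   "124-type2.csv")
write_demo(data.frame(key = 3:4, a = c(30, 40)),     "723-type1.csv")
write_demo(data.frame(key = 3:4, b = c("z", "w")),   "723-type2.csv")
write_demo(data.frame(key = 5, a = 50),              "899-type1.csv")  # no type2 partner

# Find the ids that have both a type1 and a type2 file
csv_files <- list.files(csv_dir, pattern = "\\.csv$")
ids_type1 <- sub("-type1\\.csv$", "", grep("-type1\\.csv$", csv_files, value = TRUE))
ids_type2 <- sub("-type2\\.csv$", "", grep("-type2\\.csv$", csv_files, value = TRUE))
ids_in_common <- intersect(ids_type1, ids_type2)

# Hypothetical helper: read one pair, merge it, and record which id it came from
merge_pair <- function(id) {
  df1 <- read.csv(file.path(csv_dir, paste0(id, "-type1.csv")))
  df2 <- read.csv(file.path(csv_dir, paste0(id, "-type2.csv")))
  merged <- merge(df1, df2)
  merged$source_id <- rep(id, nrow(merged))  # rep() also handles zero-row merges
  merged
}

# Merge every pair and stack the results
combined_merged_DF <- do.call(rbind, lapply(ids_in_common, merge_pair))
print(combined_merged_DF)
```

The source_id column is optional; it just makes it possible to trace each row back to its file pair after everything has been stacked. Note that rbind requires every merged piece to have the same columns, so this assumes all file pairs share one layout.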