
How to read and rbind large CSV files efficiently?

I have 20 large CSV files (100-150MB each) that I would like to load in R, rbind into one large data frame, and then analyze. Reading each CSV file runs on a single core only and takes about 7 minutes. I am on 64-bit 8-core Linux with 16GB RAM, so resources should not be an issue.
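For concreteness, the slow single-core pattern is presumably something along these lines (a reconstruction, not the asker's actual code; the directory and file names are hypothetical):

    files <- list.files("data", pattern = "\\.csv$", full.names = TRUE)
    dfs   <- lapply(files, read.csv)   # each read.csv() call runs on one core
    big   <- do.call(rbind, dfs)       # stack the 20 data frames row-wise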

Is there any way to perform this process more efficiently? I am also open to other (open-source Linux) software, for example binding the CSV files together in a different program and then loading the result into R, or anything else that could make this process faster.
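For reference, a commonly used faster pattern combines data.table::fread (a much quicker CSV parser than read.csv) with parallel::mclapply, which forks one reader per file across the available cores on Linux, and rbindlist, which stacks the results far more cheaply than repeated rbind calls. A minimal sketch under those assumptions (file paths hypothetical):

    library(data.table)
    library(parallel)

    files <- list.files("data", pattern = "\\.csv$", full.names = TRUE)
    # fread() parses each file quickly; mc.cores = 8 matches the 8-core box
    dt <- rbindlist(mclapply(files, fread, mc.cores = 8))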

Thank you very much.

Maybe you want a utility like paste. It's a shell command that merges lines of files.
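Note that paste(1) joins corresponding lines of its input files side by side; to stack rows, cat(1), or awk to drop the repeated header lines, is the usual tool. data.table::fread can read straight from such a shell pipeline via its cmd argument, so the binding and the loading happen in one step. A sketch, assuming all files share an identical header (path hypothetical):

    library(data.table)
    # awk keeps the very first line (one header) and skips the first line
    # of every subsequent file; fread() then parses the combined stream
    dt <- fread(cmd = "awk 'FNR > 1 || NR == 1' data/*.csv")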
