
Saving large data.frame into PostgreSQL with R

I'm saving a really large data.frame (30 million rows) to a PostgreSQL database from R, and it kills my PC. As this is the result of calculations produced by dplyr, I'd like to use some built-in functionality of that package, but copy_to doesn't work for such huge tables. Any suggestions?

Can you copy the data frame to a CSV or tab-delimited text file, then load that into PostgreSQL with the COPY FROM command [1]? That implements a bulk-load approach, which may perform faster.
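
A minimal sketch of that two-step approach, assuming the data.frame is called big_df, the target table my_table already exists with matching columns, the database is mydb, and psql runs on the same machine; all of those names and the file path are placeholders:

    # Write the data.frame as tab-delimited text without quoting or a header,
    # so it matches PostgreSQL's default text format for COPY ("\N" marks NULLs).
    write.table(big_df, file = "/tmp/big_df.tsv", sep = "\t",
                quote = FALSE, row.names = FALSE, col.names = FALSE, na = "\\N")

    # Bulk-load the file via psql's client-side \copy (no server file access needed);
    # a server-side COPY my_table FROM '/tmp/big_df.tsv' would also work if the
    # PostgreSQL server can read that path.
    system(paste(
      "psql -d mydb -c",
      shQuote("\\copy my_table FROM '/tmp/big_df.tsv' WITH (FORMAT text)")
    ))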

In some cases, it may be possible to use an Rscript to emit the data as a stream and pipe it directly into psql:

<Rscript that outputs tab-delimited rows> | psql -c "COPY <tablename> (columnlist, ...) FROM STDIN WITH (FORMAT text)"
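
The emitting script can be as simple as writing tab-delimited rows to stdout. A minimal sketch, assuming the data.frame big_df is built or loaded inside the script (the script name emit.R and the object name are placeholders):

    # emit.R -- stream the data.frame to stdout as tab-delimited text,
    # with no quoting and no header, matching COPY's default text format.
    write.table(big_df, file = stdout(), sep = "\t",
                quote = FALSE, row.names = FALSE, col.names = FALSE, na = "\\N")

Invoked as Rscript emit.R | psql -c "COPY ..." (optionally with pv in the middle, as noted below), this avoids materialising an intermediate file on disk.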

In some long-running cases, I put | pv | in the middle to track progress ( http://www.ivarch.com/programs/pv.shtml ).

[1] http://www.postgresql.org/docs/current/interactive/sql-copy.html
