
How can I put many script-generated records in PostgreSQL quickly from node.js?

The following code, which inserts 200,000 records into a PostgreSQL server from node.js, is taking about 17 minutes on my laptop PC, which feels horribly slow.

var pg = require('pg');
var Client = pg.Client;
var async = require('async');

var client = new Client(connectionString);
client.connect();

var rollback = function(client) {
  client.query('ROLLBACK', function() {
    client.end();
    process.exit(1); // process.kill() requires a pid; exit instead
  });
};

client.query('BEGIN', function(err, result) {
  if (err) { console.error(err); rollback(client); return; }
  async.waterfall([
    function(cb) {
      client.query('DROP INDEX idx', function(err, result) {
        client.query('TRUNCATE TABLE tbl', function(err, result) {
          async.forEach(values, function(value, valueNext) {
            client.query('INSERT INTO tbl ('
                         + 'col1,'
                         + 'col2) VALUES ($1,$2)', [
                           value,
                           generatedSomething(value)
                         ], function(err) {
                           // propagate per-row errors instead of swallowing them
                           valueNext(err);
                         });
          }, function(err) {
            if (err) { console.error(err); rollback(client); cb(err); return; }
            client.query('CREATE INDEX idx ON tbl', function(err, result) {
              cb(null);
            });
          });
        });
      });
    }
  ], function(err) {
    client.query('COMMIT', client.end.bind(client));
  });
});

There are some strategies I've applied to speed up.

  • Drop all indices before insertion, create them after all insertions are done ... ok
  • Use TRUNCATE TABLE instead of DELETE FROM ... ok
  • Use COPY FROM instead of INSERT INTO ... not done
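One more option in the same spirit (a sketch, not from the original post): batching many rows into a single multi-row INSERT cuts per-statement round trips sharply. The table and column names and the `generatedSomething` helper come from the question's code; `buildBatchInsert` is a hypothetical helper:

```javascript
// Sketch: build one parameterized multi-row INSERT for a batch of values.
// 'tbl', 'col1', 'col2' are the names from the question; the generate
// callback stands in for the question's generatedSomething.
function buildBatchInsert(batch, generate) {
  var params = [];
  var tuples = batch.map(function(value, i) {
    params.push(value, generate(value));
    // each row gets its own pair of placeholders: ($1,$2), ($3,$4), ...
    return '($' + (2 * i + 1) + ',$' + (2 * i + 2) + ')';
  });
  return {
    text: 'INSERT INTO tbl (col1,col2) VALUES ' + tuples.join(','),
    values: params
  };
}
```

The resulting `{text, values}` object can be passed to `client.query`; keep batches well under PostgreSQL's parameter limit by chunking `values` first.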

It seems that using COPY FROM instead of INSERT INTO would help, but it is meant for importing CSV files, not for script-generated values.

So, does it mean that exporting script-generated values to a temporary CSV file, and importing them using COPY FROM, is the most effective way to insert values into PostgreSQL quickly?

copyFrom will return a WritableStream that you can write your values to as CSV, like:

var stream = client.copyFrom("COPY tbl (col1, col2) FROM STDIN WITH CSV");
stream.on('close', function() {
  client.query("COMMIT");
});
stream.on('error', rollback);
async.forEach(values, function(value, valueNext) {
  stream.write(value + "," + generatedSomething(value) + "\n");
  valueNext();
}, function(err) {
  // end the stream only after every row has been written
  stream.end();
});

Of course, you will need to properly escape your values.
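A minimal sketch of that escaping — `csvEscape` is a hypothetical helper, not part of the pg API. It follows the CSV convention that `COPY ... WITH CSV` expects: quote any field containing a comma, quote, or newline, and double embedded quotes:

```javascript
// Hypothetical helper: escape one field of a CSV line fed to COPY ... WITH CSV.
function csvEscape(value) {
  var s = String(value);
  // Fields containing commas, quotes, or line breaks must be quoted,
  // with embedded double quotes doubled.
  if (/[",\n\r]/.test(s)) {
    return '"' + s.replace(/"/g, '""') + '"';
  }
  return s;
}

// Usage in the loop above:
// stream.write(csvEscape(value) + "," + csvEscape(generatedSomething(value)) + "\n");
```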
