
Inserting many rows to mysql using node.js

I'm trying to import a large file into a database. After many failed attempts to import it as a CSV file through MySQL, I decided to create a small Node script that reads the file and inserts the records one by one.

I've got about 10 differently formatted files of 80 MB each. The current script is for a file that has an id on each line and nothing more (this particular table consists only of an id field and a status field). This is its current state:

var mysql      = require('mysql');
var connection = mysql.createConnection({
    host     : 'hostname',
    user     : 'username',
    password : 'password',
    database : 'database'
});

var rl = require('readline').createInterface({
    input: require('fs').createReadStream('fileToRead.txt')
});

connection.connect();
rl.on('line', function (line) {
    var query = 'REPLACE INTO database.tablename(field1,field2) VALUES ("'+line+'",0);';
    connection.query(query, function(err) {
        if (err) {
            console.log("ERR:"+err);
            connection.end();
        }
    });
});

It works fine for about ten to twelve lines, and then throws the following error:

   <--- Last few GCs --->

   51338 ms: Scavenge 699.0 (738.6) -> 699.0 (738.6) MB, 8.7 / 0 ms (+ 15.0 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
   53709 ms: Mark-sweep 699.0 (738.6) -> 698.9 (738.6) MB, 2360.5 / 0 ms (+ 15.0 ms in 2 steps since start of marking, biggest step 15.0 ms) [last resort gc].
   56065 ms: Mark-sweep 698.9 (738.6) -> 698.9 (738.6) MB, 2360.2 / 0 ms [last resort gc].

   <--- JS stacktrace --->
   ==== JS stack trace =========================================

   Security context: 1DF25599 <JS Object>
       1: emit [events.js:~117] [pc=23C30364] (this=1027D245 <a Protocol with map 32339A39>,type=1DF4D5B1 <String[7]: enqueue>)
       2: arguments adaptor frame: 2->1
       3: _enqueue [path\node_modules\mysql\lib\protocol\Protocol.js:~128] [pc=107BD3D8] (this=1027D245 <a Protocol with map 32339A39>,sequence=157A3225 <a Query with map 3233C379>)
       4: /* anonymous */ [path...

   FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory

I'm not used to Node, so I don't really know what this means. I believe it has to do with the queries being issued from inside a loop that moves faster than the queries can complete, but I'm not sure about that, and I wouldn't know how to handle it if that were the case.
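If the loop really is outpacing the queries, the only fix I can think of is to pause the line reader until each query finishes. A rough, untested sketch of what I mean (rl.pause() and rl.resume() are standard readline methods, though pause() can still deliver a few lines that were already buffered):

rl.on('line', function (line) {
    rl.pause(); // stop emitting 'line' events while the query is in flight
    // '?' placeholders make the mysql module escape the value for us
    connection.query('REPLACE INTO database.tablename (field1, field2) VALUES (?, 0)', [line], function (err) {
        if (err) {
            console.log("ERR:" + err);
        }
        rl.resume(); // ask readline for the next line
    });
});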

Any help is appreciated.

Sorry if my English fails.

The problem is that you are ending the connection to the database after the first query finishes, so only the queries that had enough time to reach the database will be executed, and my guess is that only the first one will be inserted (maybe I am wrong here).

Since this may be a "one time only" script, you could fix your problem by just removing that line, something like this:

var mysql      = require('mysql');
var connection = mysql.createConnection({
    host     : 'hostname',
    user     : 'username',
    password : 'password',
    database : 'database'
});

var rl = require('readline').createInterface({
    input: require('fs').createReadStream('fileToRead.txt')
});

connection.connect();
rl.on('line', function (line) {
    var query = 'REPLACE INTO database.tablename(field1,field2) VALUES ("'+line+'",0);';
    connection.query(query, function(err) {
        if (err) {
            console.log("ERR:"+err);
            //I am not closing the connection anymore
        }
    });
});
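Since the files are 80 MB each, it may also help to insert in batches instead of one query per line. The mysql module expands a nested array bound to a VALUES ? placeholder into one multi-row statement. A minimal, untested sketch building on the same connection and rl objects as above (BATCH_SIZE is an arbitrary value to tune):

var BATCH_SIZE = 1000; // arbitrary; tune to your data
var rows = [];

function flush(done) {
    if (rows.length === 0) { if (done) done(); return; }
    var batch = rows;
    rows = [];
    // A nested array bound to VALUES ? becomes (v1, v2), (v1, v2), ...
    connection.query('REPLACE INTO database.tablename (field1, field2) VALUES ?', [batch], function (err) {
        if (err) console.log('ERR:' + err);
        if (done) done();
    });
}

rl.on('line', function (line) {
    rows.push([line, 0]);
    if (rows.length >= BATCH_SIZE) {
        rl.pause();                            // simple backpressure while the batch runs
        flush(function () { rl.resume(); });
    }
});

rl.on('close', function () {
    flush(function () { connection.end(); }); // write the tail, then close cleanly
});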

If your script will be used many times (once a month, once a day, or something like that), I would suggest a better solution, maybe using async and a pool of connections.
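For completeness, a minimal sketch of that pooled variant (createPool, pool.query and pool.end are part of the same mysql module; the backpressure counter is hand-rolled here rather than taken from the async library, and the limit of 10 is arbitrary):

var mysql = require('mysql');

// connectionLimit caps how many queries run in parallel;
// pool.query checks a connection out and returns it when done.
var pool = mysql.createPool({
    connectionLimit : 10,
    host            : 'hostname',
    user            : 'username',
    password        : 'password',
    database        : 'database'
});

var rl = require('readline').createInterface({
    input: require('fs').createReadStream('fileToRead.txt')
});

var pending = 0;
var finished = false;

rl.on('line', function (line) {
    pending++;
    if (pending >= 10) rl.pause(); // crude backpressure so the queue stays small
    pool.query('REPLACE INTO database.tablename (field1, field2) VALUES (?, 0)', [line], function (err) {
        if (err) console.log('ERR:' + err);
        pending--;
        if (pending < 10) rl.resume();
        if (finished && pending === 0) pool.end(); // close once everything is written
    });
});

rl.on('close', function () {
    finished = true;
    if (pending === 0) pool.end();
});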
