
How to download a whole collection with more than 500k records as CSV from MongoDB with Node.js?

I have tried this with the npm package json2csv. It works fine for up to 75,000 records, but when the data grows beyond that I get no response from the callback passed to the exporttocsv function given below.

    const json2csv = require('json2csv').parse;
    const fs = require('fs');
    var mongoClient = require('mongodb').MongoClient,
        assert = require('assert');

    var today = new Date();
    var dd = today.getDate();
    var mm = today.getMonth() + 1; // January is 0!
    var yyyy = today.getFullYear();
    if (dd < 10) {
        dd = '0' + dd;
    }
    if (mm < 10) {
        mm = '0' + mm;
    }
    var today = dd + '_' + mm + '_' + yyyy;

    router.put('/mass_report', (req, res) => {
        mass_data_download();
        res.json("Mass report download initiated");
    });

    function exporttocsv(data, name, callback) {
        /* Start: JSON to CSV conversion */
        if (!fs.existsSync('./csv/' + today + '/')) {
            fs.mkdirSync('./csv/' + today + '/');
        }

        var csv = json2csv(data);

        var fname = './csv/' + today + '/' + name + new Date().getTime() + '.csv';
        // writeFileSync is synchronous and takes no callback
        fs.writeFileSync(fname, csv);
        callback(fname);
    }

    function mass_data_download() {
        db.collection('mass_data').aggregate([
            { $match: {
                created_on: {
                    $gte: new Date("2017-09-01T00:00:00.000Z")
                }
            }}
        ]).sort({ _id: -1 }).toArray(function (error, response) {
            if (error) {
                console.log(error);
            } else {
                console.log(response.length);
                exporttocsv(response, 'mass_report', function (fname) {
                    console.log('reports download completed');
                });
            }
        });
    }

Are there any limitations when exporting data to CSV? Or how can this be achieved with any other alternatives?

The thing is that you are handling a huge amount of data in memory at the same time. You should avoid that at all costs. Node.js is perfect for using streams, so piggyback on them: treat Mongo as your readable stream, pipe it into a json2csv transform stream, and do whatever you want with the result; you will probably want to pipe it into a writable stream such as a file, or even the HTTP response.

Mongoose supports streaming (see the documentation for Query#cursor()). json2csv also supports a streaming interface; see its streaming API documentation for more details.
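For illustration, here is a minimal sketch of how such a transform stream can be constructed, assuming json2csv v4+ (which exports a Transform class); the field list is purely illustrative:

    const { Transform } = require('json2csv');

    // Object-mode transform: JavaScript objects in, CSV text out.
    // The field list is illustrative; json2csv can also infer the fields
    // from the first document if none are given.
    const json2csvTransformStream = new Transform(
        { fields: ['_id', 'created_on'] },
        { objectMode: true }
    );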

UPDATED: the final pseudocode should look like this:

    const csv = fs.createWriteStream('file.csv');

    Model.find()
        .cursor()                      // streams documents one at a time (see Mongoose Query#cursor())
        .pipe(json2csvTransformStream) // json2csv transform stream (see its streaming API)
        .pipe(csv);                    // writable file stream from fs.createWriteStream

Piping will handle the entire stream flow, and you will not have to worry about memory consumption or performance.
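Putting it all together with the native MongoDB driver used in the question, a minimal sketch (assuming json2csv v4+ for the Transform class and MongoDB driver v3.x; the connection URL, database name, and output file name are illustrative):

    const fs = require('fs');
    const { Transform } = require('json2csv');
    const { MongoClient } = require('mongodb');

    async function mass_data_download() {
        // Connection URL and database name are illustrative
        const client = await MongoClient.connect('mongodb://localhost:27017');
        const db = client.db('mydb');

        // The cursor yields documents one at a time instead of
        // materializing the whole result set with toArray()
        const cursor = db.collection('mass_data')
            .find({ created_on: { $gte: new Date('2017-09-01T00:00:00.000Z') } })
            .sort({ _id: -1 });

        const json2csvTransformStream = new Transform({}, { objectMode: true });
        const output = fs.createWriteStream('mass_report.csv');

        // MongoDB cursor -> CSV transform -> file on disk
        cursor.stream()
            .pipe(json2csvTransformStream)
            .pipe(output)
            .on('finish', () => {
                console.log('reports download completed');
                client.close();
            });
    }

Because each document is converted and flushed to disk as it arrives, memory use stays roughly flat regardless of how many records the collection holds.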
