
How to download a whole collection with more than 500k records as CSV from MongoDB with Node.js?

I have tried this with the npm package json2csv. It works fine for up to about 75,000 records; when the data is larger than that, I never get a response from the exporttocsv callback, as shown below.

    const json2csv = require('json2csv').parse;
    const fs = require('fs');
    var mongoClient = require('mongodb').MongoClient
        , assert = require('assert');
    // `router` and `db` are initialised elsewhere in the app.

    // Build a dd_mm_yyyy string used as the daily report folder name.
    var today = new Date();
    var dd = today.getDate();
    var mm = today.getMonth() + 1; // January is 0!
    var yyyy = today.getFullYear();
    if (dd < 10) {
        dd = '0' + dd;
    }
    if (mm < 10) {
        mm = '0' + mm;
    }
    today = dd + '_' + mm + '_' + yyyy;

    router.put('/mass_report', (req, res) => {
        mass_data_download();
        res.json('Mass report download initiated');
    });

    function exporttocsv(data, name, callback) {
        /* Start: JSON to CSV conversion */
        if (!fs.existsSync('./csv/' + today + '/')) {
            fs.mkdirSync('./csv/' + today + '/');
        }

        var csv = json2csv(data);

        var fname = './csv/' + today + '/' + name + new Date().getTime() + '.csv';
        // writeFileSync is synchronous and takes no callback.
        fs.writeFileSync(fname, csv);
        callback(fname);
    }

    function mass_data_download() {
        db.collection('mass_data').aggregate([
            { $match: {
                created_on: {
                    $gte: new Date('2017-09-01T00:00:00.000Z')
                }
            } }
        ]).sort({ _id: -1 }).toArray(function (error, response) {
            if (error) {
                console.log(error);
            } else {
                console.log(response.length);
                exporttocsv(response, 'mass_report', function (fname) {
                    console.log('reports download completed');
                });
            }
        });
    }

Are there any limitations when exporting data to CSV? Or how can I achieve this with any other alternative?

The thing is that you are handling a huge amount of data in memory at the same time. You should avoid that at all costs. Node.js is perfect for working with streams, so piggyback on them. Consider Mongo your readable stream, pipe it into a json2csv transform stream, and do whatever you want with the result; you will probably want to pipe it into a writable stream such as a file or even the HTTP response.

Mongoose supports streaming; you can find more information on query cursors in the Mongoose documentation. json2csv also supports a streaming interface; see the json2csv documentation for more about its streaming API.

UPDATED: the final pseudocode should look like this:

    const fs = require('fs');
    const { Transform } = require('json2csv');

    const csv = fs.createWriteStream('file.csv');
    // objectMode lets the json2csv Transform consume documents straight from the cursor
    const json2csvTransformStream = new Transform({}, { objectMode: true });

    Model.find()
        .cursor()                      // Mongoose query cursor: a readable stream of documents
        .pipe(json2csvTransformStream) // json2csv Transform stream turns each document into a CSV row
        .pipe(csv);                    // write the CSV to file.csv as it is produced

Piping handles the whole stream flow for you, so you do not have to worry about memory leaks or performance.
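If you are using the native mongodb driver (as in the question) rather than Mongoose, the same idea applies: the aggregation cursor can be exposed as a readable stream and piped through json2csv's Transform stream into a file. Below is a minimal sketch of that approach; the connection URL, database name, and output path are placeholders, and it assumes the MongoDB Node.js driver 4.x and a json2csv version that ships the Transform class (4+).

    const fs = require('fs');
    const { MongoClient } = require('mongodb');
    const { Transform } = require('json2csv');

    async function mass_data_download_stream() {
        // Placeholder connection URL and database name.
        const client = await MongoClient.connect('mongodb://localhost:27017');
        const db = client.db('mydb');

        const output = fs.createWriteStream('./csv/mass_report.csv');
        // objectMode lets the transform accept documents straight from the cursor.
        const json2csvTransform = new Transform({}, { objectMode: true });

        db.collection('mass_data')
            .aggregate([
                { $match: { created_on: { $gte: new Date('2017-09-01T00:00:00.000Z') } } },
                { $sort: { _id: -1 } }
            ], { allowDiskUse: true }) // let the server spill the large sort to disk
            .stream()                  // readable stream that emits one document at a time
            .pipe(json2csvTransform)   // convert each document to a CSV row
            .pipe(output)              // write rows to disk as they arrive
            .on('finish', () => {
                console.log('reports download completed');
                client.close();
            });
    }

Because documents flow through the pipeline one at a time, memory usage stays roughly flat no matter how many records the collection holds, which the in-memory toArray/json2csv approach in the question cannot do.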
