简体   繁体   中英

How to handle async Node.js in a loop

I have such a loop :

var i,j,temparray,chunk = 200;
for (i=0,j=document.mainarray.length; i<j; i+=chunk) {
  temparray = document.mainarray.slice(i,i+chunk);

  var docs =  collection.find({ id: { "$in": temparray}}).toArray();

  docs.then(function(singleDoc)
  {
    if(singleDoc)
    {
      console.log("single doc length : " + singleDoc.length);
      var t;
      for(t = 0, len = singleDoc.length; t < len;t++)
      {
        fs.appendFile("C:/Users/x/Desktop/names.txt", singleDoc[t].name + "\n", function(err) {
          if(err) {
            return console.log(err);
          }
        });
      }
    }
  });
}

The loop iterates for two times. In first iteration it gets 200 elements, in second, it gets 130 elements. And when I open the .txt file, I see only 130 names. I guess because of the async nature of Node.js, only second part of the array is processed. What should I do to get all parts of the array to be processed? Thanks in advance.

EDIT : I finally turned the code to this :

var generalArr = [];
var i,j,temparray,chunk = 200;
for (i=0,j=document.mainarray.length; i<j; i+=chunk) {
    temparray = document.mainarray.slice(i,i+chunk);

generalArr.push(temparray);


} 

async.each(generalArr, function(item, callback)
{

  var docs =  collection.find({ id: { "$in": item}}).toArray();

   docs.then(function(singleDoc)
  {
    if(singleDoc)
    {
      console.log("single doc length : " + singleDoc.length);
              var t;
        for(t = 0, len = singleDoc.length; t < len;t++)
        {    
           fs.appendFile("C:/Users/x/Desktop/names.txt", singleDoc[t].name + "\n", function(err) {
          if(err) {
          return console.log(err);
          }
        });
        }
    }


  });

  callback(null);
})

When I change this line :

var docs =  collection.find({ id: { "$in": item}}).toArray();

To this line :

var docs =  collection.find({ id: { "$in": item}}).project({ name: 1 }).toArray();

It works, I'm able to print all names. I guess there is a problem with memory when I try without .project() . How can I make this work without using project? Should I change some memory limits? Thanks in advance.

I think your code is unnecessary complicated and appending file in a loop is very expensive when compared to in-memory computation. A better way would be to write to file just once.

var i, j, temparray, chunk = 200;
for (i = 0, j = document.mainarray.length; i < j; i += chunk) {
  temparray = document.mainarray.slice(i, i + chunk);
  generalArr.push(temparray);
}
const queryPromises = [];
generalArr.forEach((item, index) => {
  queryPromises.push(collection.find({ id: { "$in": item } }).toArray());
});
let stringToWrite = '';
Promise.all(queryPromises).then((result) => {
  result.forEach((item) => {
    item.forEach((element) => {
      //create a single string which you want to write
      stringToWrite = stringToWrite + "\n" + element.name;
    });
  });
  fs.appendFile("C:/Users/x/Desktop/names.txt", stringToWrite, function (err) {
    if (err) {
      return console.log(err);
    } else {
      // call your callback or return
    }
  });
});

In the code above, I do the following.

  1. Wait for all the db queries to finish
  2. Lets iterate over this list and create one string that we need to write to the file
  3. Write to the file

Once you go asynchronous you cannot go back - all your code needs to be asynchronous. In node 8 you handle this with async and await keywords. In older versions you can use Promise - async / await are just syntax sugar for it anyway.

However, most of the API in node are older than Promise , and so they use callbacks instead. There is a promisify function to update callback functions to promises.

There are two ways to handle this, you can let all the asynchronous actions happen at the same time, or you can chain them one after another (which preserves order but takes longer).

So, collection.find is asynchronous, it either takes a callback function or returns a Promise . I'm going to assume that the API you're using does the latter, but your problem could be the former (in which case look up promisify ).

var findPromise =  collection.find({ id: { "$in": item}});

Now, at this point findPromise holds the running find action. We say this is a promise that resolves (completes successfully) or rejects (throws an error). We want to queue up an action to do once it completes, and we do that with then :

// The result of collection.find is the collection of matches
findPromise.then(function(docs) {
    // Any code we run here happens asynchronously
});

// Code here will run first

Inside the promise we can return further promises (allowing them to be chained - complete one async, then complete the next, then fire the final resolve once all done) or use Promise.all to let them all happen in parallel and resolve once done:

var p = new Promise(function(resolve, reject) {
    var findPromise =  collection.find({ id: { "$in": item}});
    findPromise.then(function(docs) {
        var singleDocNames = [];
        for(var i = 0; i < docs.length; i++) {
            var singleDoc = docs[i];
            if(!singleDoc)
                 continue;

            for(var t = 0; t < singleDoc.length; t++)
                singleDocNames.push(singleDoc[t].name);
        }

        // Resolve the outer promise with the final result
        resolve(singleDocNames);
    });
});

// When the promise finishes log it to the console
p.then(console.log);

// Code inline here will fire before the promise

This is much easier in node 8 with async / await :

async function p() {
    // Await puts the rest of this function in the .then() of the promise
    const docs = await collection.find({ id: { "$in": item}});

    const singleDocNames = [];
    for(var i = 0; i < docs.length; i++) {
        // ... synchronous code unchanged ...
    }

    // Resolve the outer promise with the final result
    return singleDocNames;
});

// async functions can be treated like promises
p().then(console.log);

If you need to write the results to a text file asynchronously there are a couple of ways to do it - you can wait until the end and write all of them, or chain a promise to write them after each find, though I find parallel IO operations tend to be at more risk of deadlocks.

Code above have multiple issues about asynchronous control flow. Similar code possible can exists, but only if case of using ES7 async/await operators on all async operation.

Of course, you can easily achieve solution by promises sequence. Solution:

let flowPromise = Promise.resolve();

const chunk = 200;
for (let i=0,j=document.mainarray.length; i<j; i+=chunk) {
    flowPromise = flowPromise.then(() => {
        const temparray = document.mainarray.slice(i,i+chunk);
        const docs =  collection.find({ id: { "$in": temparray}}).toArray();
        return docs.then((singleDoc) => {
            let innerFlowPromise = Promise.resolve();
            if(singleDoc) {
                console.log("single doc length : " + singleDoc.length);
                for(let t = 0, len = singleDoc.length; t < len;t++) {
                    innerFlowPromise = innerFlowPromise.then(() => new Promise((resolve, reject) =>
                        fs.appendFile(
                            "C:/Users/x/Desktop/names.txt", singleDoc[t].name + "\n",
                            err => (err ? reject(err) : resolve())
                        )
                    ));
                }
            }
            return innerFlowPromise;
        }
    });
}

flowPromise.then(() => {
    console.log('Done');
}).catch((err) => {
    console.log('Error: ', err);
})

When use async-like control flow, based on Promises, always remember that every loop and function call sequence will not pause execution till async operation be done, so include all then sequences manually. Or use async/await syntax.

Which version of nodejs are you using? You should use the native async/await support which is built into newer versions nodejs (no libraries required). Also note, fs.appendFile is asyncronous so you need to either use a library like promisify to transform the callback into a promise or just use the appendFileSync and suffer the blocking IO (but might be okay for you, depending on the use case.)

async function(){
    ...
    for(var item of generalArr) {
      var singleDoc = await collection.find({ id: { "$in": item}}).toArray();
      // if(singleDoc) { this won't do anything, since collection.find will always return something even if its just an empty array
      console.log("single doc length : " + singleDoc.length);
      var t;
      for(t = 0, len = singleDoc.length; t < len;t++){    
          fs.appendFileSync("C:/Users/x/Desktop/names.txt", singleDoc[t].name + "\n");
       }

    };
}    
var docs =  collection.find({ id: { "$in": document.mainarray}}), // returns a cursor
  doc,
  names = [],
  toInsert;

function saveToFile(cb) {
  toInsert = names.splice(0,100);
  if(!toInsert.length) return cb();
  fs.appendFile("C:/Users/x/Desktop/names.txt", toInsert.join("\n"), cb);
}

(function process() {
  if(docs.hasNext()) {
    doc = docs.next();

    doc.forEach(function(d) {
      names.push(d.name);
    });

    if(names.length === 100) {
      // save when we have 100 names in memory and clear the memory
      saveToFile(function(err) {
        process();
      });
    } else {
       process();
    }
  } else {
    saveToFile(function(){
      console.log('All done');
    });
  }
}()); // invoke the function

If you can't solve your issue using core modules and basic nodejs, there is most likely a lack of understanding of how things work or insufficient knowledge about a library (in this case FileSystem module).

Here is how you can solve your issue, without 3th party libraries and such.

'use strict';

const
    fs = require('fs');

let chunk = 200;

// How many rounds of array chunking we expect 
let rounds = Math.ceil(mainArray.length/chunk);
// copy to temp (for the counter)
let tempRounds = rounds;
// set file name
let filePath = './names.txt'


// Open writable Stream
let myFileStream = fs.createWriteStream(filePath);


// from round: 0-${rounds}
for (let i = 0; i < rounds; i++) {
    // assume array has ${chunk} elements left in this round
    let tempChunk = chunk;
    // if ${chunk} is to big i.e. i=3 -> chunk = 600 , but mainArray.length = 512
    // This way we adjust the last round for "the leftovers"
    if (mainArray.length < i*chunk) tempChunk = Math.abs(mainArray.length - i*chunk);
    // slice it for this round
    let tempArray = mainArray.slice(i*chunk, i*chunk + tempChunk);
    // get stuff from DB
    let docs =  collection.find({ id: { "$in": tempArray}}).toArray();

    docs.then(function(singleDoc){
        // for each name in the doc
        for (let j = 0; j < singleDoc.length; j++) {
            // write to stream
            myFileStream.write(singleDoc[t].name + "\n");
        }
        // declare round done (reduce tempRounds) and check if it hits 0
        if (!--tempRounds) {
            // if all rounds are done, end the stream
            myFileStream.end();
            // BAM! you done
            console.log("Done")
        }
    });
}

The key is to use fs.WritableStreams :) link here to docs

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM