
Using Async waterfall in node.js

I have 2 functions that I'm running asynchronously. I'd like to write them using the waterfall model. The thing is, I don't know how.

Here is my code:

var fs = require('fs');
function updateJson(ticker, value) {
  //var stocksJson = JSON.parse(fs.readFileSync("stocktest.json"));
  fs.readFile('stocktest.json', function(error, file) {
    var stocksJson =  JSON.parse(file);

    if (stocksJson[ticker]!=null) {
      console.log(ticker+" price : " + stocksJson[ticker].price);
      console.log("changing the value...")
      stocksJson[ticker].price =  value;

      console.log("Price after the change has been made -- " + stocksJson[ticker].price);
      console.log("printing the JSON.stringify output")
      console.log(JSON.stringify(stocksJson, null, 4));
      fs.writeFile('stocktest.json', JSON.stringify(stocksJson, null, 4), function(err) {  
        if(!err) {
          console.log("File successfully written");
        }
        if (err) {
          console.error(err);
        }
      }); //end of writeFile
    } else {
      console.log(ticker + " doesn't exist on the json");
    }
  });
} // end of updateJson 

Any idea how I can write it using waterfall, so I'll be able to control the flow? Please show me some examples, because I'm new to node.js.

First identify the steps and write them as asynchronous functions (taking a callback argument)

  • read the file

     function readFile(readFileCallback) {
         fs.readFile('stocktest.json', function (error, file) {
             if (error) {
                 readFileCallback(error);
             } else {
                 readFileCallback(null, file);
             }
         });
     }
  • process the file (I removed most of the console.log in the examples)

     function processFile(file, processFileCallback) {
         var stocksJson = JSON.parse(file);
         if (stocksJson[ticker] != null) {
             stocksJson[ticker].price = value;
             fs.writeFile('stocktest.json', JSON.stringify(stocksJson, null, 4), function (error) {
                 if (error) {
                     processFileCallback(error);
                 } else {
                     console.log("File successfully written");
                     processFileCallback(null);
                 }
             });
         } else {
             console.log(ticker + " doesn't exist on the json");
             processFileCallback(null); // the callback should always be called exactly once
         }
     }

Note that I did no specific error handling here; I'll take advantage of async.waterfall to centralize error handling in one place.

Also be careful that if you have branches (if/else/switch/...) in an asynchronous function, every path must call the callback exactly once.
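To make the "exactly one callback call per branch" rule concrete, here is a minimal sketch (the function and argument names are hypothetical, not from the original code):

```javascript
// Every path through the function ends in exactly one callback invocation.
function updatePrice(stocks, ticker, value, callback) {
    if (stocks[ticker] != null) {
        stocks[ticker].price = value;
        callback(null, stocks);                                      // success branch
    } else {
        callback(new Error(ticker + " doesn't exist on the json"));  // error branch
    }
    // no code after the branches, so the callback can never fire twice
}

updatePrice({ GOOG: { price: 100 } }, 'GOOG', 110, function (err, stocks) {
    if (!err) console.log(stocks.GOOG.price); // 110
});
```

If you returned early from one branch but fell through to a second callback call in another, async.waterfall would move on twice and the flow would break.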

Plug everything with async.waterfall

async.waterfall([
    readFile,
    processFile
], function (error) {
    if (error) {
        //handle readFile error or processFile error here
    }
});

Clean example

The previous code was excessively verbose to make the explanations clearer. Here is a full cleaned example:

async.waterfall([
    function readFile(readFileCallback) {
        fs.readFile('stocktest.json', readFileCallback);
    },
    function processFile(file, processFileCallback) {
        var stocksJson = JSON.parse(file);
        if (stocksJson[ticker] != null) {
            stocksJson[ticker].price = value;
            fs.writeFile('stocktest.json', JSON.stringify(stocksJson, null, 4), function (error) {
                if (!error) {
                    console.log("File successfully written");
                }
                processFileCallback(error);
            });
        }
        else {
            console.log(ticker + " doesn't exist on the json");
            processFileCallback(null);
        }
    }
], function (error) {
    if (error) {
        //handle readFile error or processFile error here
    }
});

I kept the function names because it helps readability, and helps debugging with tools like the Chrome debugger.

If you use underscore (on npm), you can also replace the first function with _.partial(fs.readFile, 'stocktest.json').

First and foremost, make sure you read the documentation for async.waterfall.

Now, there are a couple of key parts to the waterfall control flow:

  1. The control flow is specified by an array of functions to invoke as the first argument, and a "complete" callback, called when the flow is finished, as the second argument.
  2. The array of functions is invoked in series (as opposed to in parallel).
  3. If an error (usually named err) is encountered at any operation in the flow array, it will short-circuit and immediately invoke the "complete"/"finish"/"done" callback.
  4. Arguments from the previously executed function are applied to the next function in the control flow, in order, and an "intermediate" callback is supplied as the last argument. Note: the first function only has this "intermediate" callback, and the "complete" callback will have the arguments of the last invoked function in the control flow (with consideration to any errors), but with an err argument prepended instead of an "intermediate" callback appended.
  5. The callbacks for each individual operation (I call this cbAsync in my examples) should be invoked when you're ready to move on: the first parameter will be an error, if any, and the second (third, fourth, etc.) parameter will be any data you want to pass to the subsequent operation.
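The argument threading described in points 3 and 4 can be illustrated with a hand-rolled miniature. This is a simplified sketch of the semantics only, not the real async.waterfall (which also guards against a callback being called twice, among other things):

```javascript
// Miniature waterfall: run tasks in series, feeding each task's results
// to the next, and short-circuit to `done` on the first error.
function miniWaterfall(tasks, done) {
    function next(index, args) {
        if (index === tasks.length) {
            return done.apply(null, [null].concat(args)); // no error, final results
        }
        tasks[index].apply(null, args.concat(function (err) {
            if (err) return done(err); // short-circuit on error
            next(index + 1, Array.prototype.slice.call(arguments, 1));
        }));
    }
    next(0, []);
}

// usage: each result is applied to the next task, in order
miniWaterfall([
    function (cb) { cb(null, 2); },          // first task: only the callback
    function (two, cb) { cb(null, two * 3); } // gets 2, passes on 6
], function (err, six) {
    console.log(err, six); // null 6
});
```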

The first goal is to get your code working almost verbatim alongside the introduction of async.waterfall . I decided to remove all your console.log statements and simplified your error handling. Here is the first iteration ( untested code ):

var fs = require('fs'),
    async = require('async');

function updateJson(ticker,value) {
    async.waterfall([ // the series operation list of `async.waterfall`
        // waterfall operation 1, invoke cbAsync when done
        function getTicker(cbAsync) {
            fs.readFile('stocktest.json',function(err,file) {
                if ( err ) {
                    // if there was an error, let async know and bail
                    cbAsync(err);
                    return; // bail
                }
                var stocksJson = JSON.parse(file);
                if ( stocksJson[ticker] == null ) {
                    // if we don't have the ticker, let "complete" know and bail
                    cbAsync(new Error('Missing ticker property in JSON.'));
                    return; // bail
                }
                stocksJson[ticker].price = value;
                // err = null (no error), jsonString = JSON.stringify(...)
                cbAsync(null,JSON.stringify(stocksJson,null,4));    
            });
        },
        function writeTicker(jsonString,cbAsync) {
            fs.writeFile('stocktest.json',jsonString,function(err) {
                cbAsync(err); // err will be null if the operation was successful
            });
        }
    ],function asyncComplete(err) { // the "complete" callback of `async.waterfall`
        if ( err ) { // there was an error with either `getTicker` or `writeTicker`
            console.warn('Error updating stock ticker JSON.',err);
        } else {
            console.info('Successfully completed operation.');
        }
    });
}

The second iteration divides up the operation flow a bit more. It puts it into smaller single-operation oriented chunks of code. I'm not going to comment it, it speaks for itself ( again, untested ):

var fs = require('fs'),
    async = require('async');

function updateJson(ticker,value,callback) { // introduced a main callback
    var stockTestFile = 'stocktest.json';
    async.waterfall([
        function getTicker(cbAsync) {
            fs.readFile(stockTestFile,function(err,file) {
                cbAsync(err,file);
            });
        },
        function parseAndPrepareStockTicker(file,cbAsync) {
            var stocksJson = JSON.parse(file);
            if ( stocksJson[ticker] == null ) {
                cbAsync(new Error('Missing ticker property in JSON.'));
                return;
            }
            stocksJson[ticker].price = value;
            cbAsync(null,JSON.stringify(stocksJson,null,4));
        },
        function writeTicker(jsonString,cbAsync) {
            fs.writeFile(stockTestFile,jsonString,function(err) {
                cbAsync(err);
            });
        }
    ],function asyncComplete(err) {
        if ( err ) {
            console.warn('Error updating stock ticker JSON.',err);
        }
        callback(err);
    });
}

The last iteration short-hands a lot of this with the use of some bind tricks to decrease the call stack and increase readability (IMO), also untested:

var fs = require('fs'),
    async = require('async');

function updateJson(ticker,value,callback) {
    var stockTestFile = 'stocktest.json';
    async.waterfall([
        fs.readFile.bind(fs,stockTestFile),
        function parseStockTicker(file,cbAsync) {
            var stocksJson = JSON.parse(file);
            if ( stocksJson[ticker] == null ) {
                cbAsync(new Error('Missing ticker property in JSON.'));
                return;
            }
            cbAsync(null,stocksJson);
        },
        function prepareStockTicker(stocksJson,cbAsync) {
            stocksJson[ticker].price = value;
            cbAsync(null,JSON.stringify(stocksJson,null,4));
        },
        fs.writeFile.bind(fs,stockTestFile)
    ],function asyncComplete(err) {
        if ( err ) {
            console.warn('Error updating stock ticker JSON.',err);
        }
        callback(err);
    });
}
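The bind trick above relies on partial application: bind pre-fills leading arguments and returns a new function that still accepts the remaining ones (here, the intermediate callback that async.waterfall supplies). A minimal illustration, with hypothetical names:

```javascript
function add(a, b) {
    return a + b;
}

// bind(null, 5) fixes `a` to 5; the returned function still takes `b`
var addFive = add.bind(null, 5);

console.log(addFive(3)); // 8

// fs.readFile.bind(fs, stockTestFile) works the same way: it returns a
// function that only needs the callback, which is exactly the shape
// async.waterfall expects for the first task.
```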

Basically, Node.js (and more generally JavaScript) functions that take some time to execute (whether for I/O or CPU processing) are typically asynchronous, so the event loop (simply put, a loop that continuously checks for tasks to execute) can invoke the function right below the first one without being blocked waiting for a response. If you are familiar with other languages like C or Java, you can think of an asynchronous function as one that runs on another thread (that's not necessarily true in JavaScript, but the programmer shouldn't have to care about it): when execution terminates, that thread notifies the main one (the event loop) that the job is done and the results are available.
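A tiny sketch of this non-blocking ordering, using setTimeout to stand in for any asynchronous operation:

```javascript
var order = [];

order.push('first');

// schedule an asynchronous task; the event loop runs it later,
// after the current synchronous code has finished
setTimeout(function () {
    order.push('async task done');
    console.log(order.join(' -> ')); // first -> second -> async task done
}, 0);

// this line runs before the asynchronous task,
// even though it appears after it in the source
order.push('second');
```

This is why you cannot simply write one async call after another and expect the first to have finished; you have to be told it finished, via a callback.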

As said, once the first function has finished its job it must be able to signal that it is done, and it does so by invoking the callback function you pass to it. For example:

var callback = function(err, data)
{
   if(!err)
   {
     // do something with the received data
   }
   else
     // something went wrong
}


asyncFunction1(someparams, callback);

asyncFunction2(someotherparams);

The execution flow would call asyncFunction1, then asyncFunction2, and every function below, without waiting. Only once asyncFunction1 ends is the callback function, passed as its last parameter, called to do something with the data (if no errors occurred).

So, to make 2 or more asynchronous functions execute one after another, each starting only when the previous one has ended, you have to call each one inside the previous one's callback:

asyncTask1(data, function(err, result1)
{
   if(!err)
     asyncTask2(data, function(err2, result2)
     {
        if(!err2)
          //call maybe a third async function
        else
          console.log(err2);
     });
   else
     console.log(err);
});

result1 is the value produced by asyncTask1 and result2 is the value produced by asyncTask2. In this way you can nest as many asynchronous functions as you want.

In your case if you want another function to be called after updateJson() you must call it after this line:

console.log("File successfully written");
