NodeJS async.waterfall — Untold facts
When you need a server process that resembles an assembly line in a factory!
Preamble: the async module offers many useful control-flow functions, almost like logic gates, such as parallel, series, queue and priority queue. Here we look at the most used and discussed of them all, async.waterfall.
Of course there are lots of beautiful explanations and simple examples on how to code an async.waterfall model. But there are 3 vital points that almost none of them mention.
First, let's see where and how this waterfall comes in handy.
Conventional waterfall model
Let's say we need to do something like this, where each step takes its input from the prior step and passes its processed details on to the next one.
Pseudo Code
1. Get employee details from DB
2. Calculate his salary based on his package and leave details
3. Do a bank transfer
4. Notify employee by email with pay slip
Node.js uses callbacks as the mechanism to mark the completion of a particular function: the callback reports the status (an error, or success) and then anything that needs to be passed back to the calling function.
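To make the convention concrete, here is a minimal sketch of an error-first callback. The findEmployee helper and its in-memory table are made up for illustration; a real version would query the DB.

```javascript
// Hypothetical helper, for illustration only: looks up an employee in an
// in-memory table and reports back through an error-first callback.
function findEmployee(empId, callback) {
    const table = { 1014: { name: 'Asha', pkg: 50000 } }; // made-up data
    const employee = table[empId];
    if (!employee) {
        // The first argument carries the error; no result is passed.
        return callback(new Error('No employee with id ' + empId));
    }
    // null in the first slot means "no error"; the result follows.
    callback(null, employee);
}

const outcomes = [];
findEmployee(1014, function (err, employee) {
    outcomes.push(err ? 'error: ' + err.message : 'found ' + employee.name);
});
findEmployee(9999, function (err, employee) {
    outcomes.push(err ? 'error: ' + err.message : 'found ' + employee.name);
});
console.log(outcomes);
```

The caller always checks the first argument before touching the result, and that is the same contract every step of a waterfall follows.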
But when we have a handful of processes to be done in sequence, the callbacks end up nested one inside the other like this, which most people refer to as callback hell.
Nested callback
getEmployeedetails(args, function () {
    calcSalary(args, function () {
        backTransfer(args, function () {
            notifyEmployee…
        })
    })
})
Bear in mind this is just an outline of the functions. In reality each one will have umpteen lines, including error tracking, and managing the code later becomes cumbersome.
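For a feel of how the nesting grows once error checks are added, here is a runnable sketch of the same four steps. The step bodies are hypothetical stand-ins that call back synchronously with made-up numbers; only the shape of the nesting matters.

```javascript
// Hypothetical stand-ins for the four steps; each one just calls back
// synchronously with made-up data so the sketch can run on its own.
function getEmployeedetails(empId, cb) { cb(null, { empId: empId, pkg: 50000, leaveDays: 2 }); }
function calcSalary(emp, cb) { cb(null, emp.pkg - emp.leaveDays * 100); }
function backTransfer(amount, cb) { cb(null, 'transferred ' + amount); }
function notifyEmployee(receipt, cb) { cb(null, 'mailed: ' + receipt); }

let finalMessage;
getEmployeedetails(1014, function (err, emp) {
    if (err) { return console.log('error', err); }
    calcSalary(emp, function (err, amount) {
        if (err) { return console.log('error', err); }
        backTransfer(amount, function (err, receipt) {
            if (err) { return console.log('error', err); }
            notifyEmployee(receipt, function (err, message) {
                if (err) { return console.log('error', err); }
                finalMessage = message;
                console.log(finalMessage);
            });
        });
    });
});
```

Every level repeats the same error check and pushes the real work one indent further to the right.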
Manageable with async.waterfall
The pseudo code is now mapped onto functions inside the async array. Note that the functions are actually inside the [] array, each passing arguments or parameters to the next one. The “done” in steps 2 to 4 is simply the name I have given to the callback function.
var async = require('async');
/*-----------start of async call ------------*/
async.waterfall([
    function (callback) {
        callback(null, 'Get details is step 1');
    },
    function (startparam, done) {
        console.log('Sal param - ' + startparam);
        done(null, 'after salary from step 2'); // <- value passed on to step 3
    },
    function (step2Result, done) {
        console.log(step2Result);
        done(null, 'after transfer from step 3'); // <- value passed on to step 4
    },
    function (step3Result, done) {
        console.log(step3Result);
        done(null, 'mail sent - last step');
    }
],
function (err) { /*-----------final callback from async call ------------*/
    if (err) {
        console.log('error');
    } else {
        console.log('No error happened in any steps, operation done!');
    }
});
/*-----------end of async call ------------*/
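To see why each step's result shows up as the next step's first parameter, it can help to look at a stripped-down runner. The miniWaterfall function below is a hypothetical sketch written for this post, not the library's actual implementation, and it assumes every step calls its callback exactly once.

```javascript
// miniWaterfall is a hypothetical, simplified stand-in written for this
// sketch; the real async.waterfall handles more cases than this.
function miniWaterfall(tasks, finalCallback) {
    let index = 0;

    function next(err, ...results) {
        // Any error, or running out of tasks, jumps to the final callback.
        if (err || index === tasks.length) {
            return finalCallback(err, ...results);
        }
        const task = tasks[index++];
        // Results of the previous step become the arguments of this one,
        // followed by the callback that advances the chain.
        task(...results, next);
    }

    next(null);
}

const log = [];
miniWaterfall([
    function (callback) { callback(null, 'step 1'); },
    function (fromStep1, done) { log.push(fromStep1); done(new Error('step 2 failed')); },
    function (fromStep2, done) { log.push('never reached'); done(null); }
], function (err) {
    log.push(err ? 'stopped: ' + err.message : 'all done');
});
console.log(log);
```

The error raised in the second step skips the third step entirely and lands in the final callback, which is the same short-circuit behaviour async.waterfall gives you.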
Now that we have achieved the async.waterfall flow model, let's get into the 3 points I had pondered.
Named functions
The control flow has improved for sure, but it is still lengthy for someone who comes in for maintenance!
To make it more readable, move the functions outside of the control flow and give them beautiful names. The important thing to note is:
You can't write the named functions as calls with explicit input parameters and callbacks inside the array. That would run each function immediately, while the array is still being built, and the program will error out at loader.js. Just list the comma-separated function names in the async array; the input parameters and the callback are supplied implicitly.
var async = require('async');
/*-----------start of async call ------------*/
async.waterfall([
    getEmployeedetails,
    calcSalary,
    backTransfer,
    notifyEmployee
],
function (err) { /*----final callback from async call --------*/
    if (err) {
        console.log('error');
    } else {
        console.log('No error happened in any steps, operation done!');
    }
});
/*-----------end of async call ------------*/
/*-----------step functions inside call ------------*/
function getEmployeedetails(callback) {
    callback(null, 'Get details is step 1');
}
function calcSalary(startparam, done) {
    console.log('Sal param - ' + startparam);
    done(null, 'after salary from step 2'); // <- value passed on to step 3
}
function backTransfer(step2Result, done) {
    console.log(step2Result);
    done(null, 'after transfer from step 3'); // <- value passed on to step 4
}
function notifyEmployee(step3Result, done) {
    console.log(step3Result);
    done(null, 'mail sent - last step');
}
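The difference between listing a function and calling it is easy to check in isolation. The sketch below needs no library; it shows one way the mistake can surface (a TypeError here, though the exact error depends on what is written inside the parentheses).

```javascript
function getEmployeedetails(callback) {
    callback(null, 'Get details is step 1');
}

// Correct: the array holds the function itself, to be called later.
const byReference = [getEmployeedetails];
console.log(typeof byReference[0]);

// Mistake: the parentheses run the function now, while the array is
// being built. No callback exists yet, so the call throws.
let loadError = null;
try {
    const byCall = [getEmployeedetails()];
} catch (e) {
    loadError = e;
}
console.log(loadError instanceof TypeError);
```

In the first form the waterfall decides when to call each function and with which arguments; in the second form the call has already happened before the waterfall even starts.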
Input parameter to the first function in the stack
We assumed the employee id would be the input to the async control flow, and that the steps in the stack would proceed from there. What I discovered is that waterfall passes nothing to the first function: it receives only the callback, as seen above.
A simple way to pass an input parameter that kicks off the async series is to put a dummy set-up function as the first call inside the series and then continue with the rest. The set-up function does nothing but call back with the value, which then becomes the input parameter of the real start function.
Here allset is the dummy callback that supplies the parameter to getEmployeedetails, the actual first function in the stack. In practice the empId would typically come from an HTTP POST request.
var async = require('async');
/*-----------start of async call ------------*/
var empId = 1014;
async.waterfall([
    function setParam(allset) {
        allset(null, empId);
    },
    getEmployeedetails,
    calcSalary,
    backTransfer,
    notifyEmployee
],
function (err) { /*----final callback from async call --------*/
    if (err) {
        console.log('error');
    } else {
        console.log('No error happened in any steps, operation done!');
    }
});
/*-----------end of async call ------------*/
/*-----------step functions inside call ------------*/
function getEmployeedetails(dummysetup, callback) {
    console.log('From setParam - ' + dummysetup);
    callback(null, 'Get details is step 1');
}
function calcSalary(startparam, done) {
    console.log('sal param - ' + startparam);
    done(null, 'after salary from step 2'); // <- value passed on to step 3
}
function backTransfer(step2Result, done) {
    console.log(step2Result);
    done(null, 'after transfer from step 3'); // <- value passed on to step 4
}
function notifyEmployee(step3Result, done) {
    console.log(step3Result);
    done(null, 'mail sent - last step');
}
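The dummy set-up function is not the only option. Another is to bind the parameter to the first step up front, so that what goes into the array expects nothing but the callback. The sketch below uses plain Function.prototype.bind and runs without the library. The async module also ships helpers aimed at this situation (async.constant and async.apply); check the documentation for the version you use before relying on them.

```javascript
const empId = 1014;

function getEmployeedetails(id, callback) {
    callback(null, 'details for employee ' + id);
}

// bind() fixes the first argument and returns a new function that only
// expects the callback, which is the shape a first waterfall step needs.
const firstStep = getEmployeedetails.bind(null, empId);

let received;
firstStep(function (err, details) {
    received = details;
});
console.log(received);
```

With this approach firstStep would take the place of both setParam and getEmployeedetails at the head of the array.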
Meaningful callback names
We need not actually name it “callback” as we generally do. callback is not a keyword, only a concept: it means the function is finished and ready to hand control back, that's all.
Each function can give its callback a different name, say dbdone or salarydone, which makes the flow much easier to follow.
var async = require('async');
/*-----------start of async call ------------*/
var empId = 1014;
async.waterfall([
    function setParam(allset) {
        allset(null, empId);
    },
    getEmployeedetails,
    calcSalary,
    backTransfer,
    notifyEmployee
],
function (err) { /*--final callback from async call ------*/
    if (err) {
        console.log('error');
    } else {
        console.log('No error happened in any steps, operation done!');
    }
});
/*-----------end of async call ------------*/
/*-----------step functions inside call ------------*/
function getEmployeedetails(dummysetup, getDone) {
    console.log('From setParam - ' + dummysetup);
    getDone(null, 'Get details is step 1');
}
function calcSalary(startparam, saldone) {
    console.log('sal param - ' + startparam);
    saldone(null, 'after salary from step 2');
}
function backTransfer(step2Result, xfrdone) {
    console.log(step2Result);
    xfrdone(null, 'after transfer from step 3');
}
function notifyEmployee(step3Result, notifydone) {
    console.log(step3Result);
    notifydone(null, 'mail sent - last step');
}
That's all you need to enjoy the full benefits offered by async.waterfall.
Kindly note that the whole series can also be looped:
- To work over, say, an array of employees
- To error out and stop processing for a particular employee and continue with the rest
- To report a list of errored and successful runs out of an array
Maybe someday I will document that too!
Hope this made life easier for a learner!
Patience signing off for now!