
Node.js outbound http request concurrency

I've got a node.js script that pulls data from an external web API for local storage. The first request is a query that returns a list of IDs that I need to get further information on. For each ID returned, I spawn a new http request from node.js and reach out to the server for the data (POST request). Once the job is complete, I sleep for 3 minutes, and repeat. Sometimes the number of IDs is in the hundreds. Each individual http request for those returns maybe 1kb of data, usually less, so the round trip is very short.

I got an email this morning from the API provider begging me to shut off my process because I'm "occupying all of the API servers with hundreds of connections" (which I am actually pretty proud of, but that is not the point). To be nice, I increased the sleep from 3 minutes to 30 minutes, and that has helped them so far.

On to the question... now I've not set maxSockets or anything, so I believe the default is 5. Shouldn't that mean I can only create 5 live http request connections at a time? How does the admin have hundreds? Is their server not hanging up the connection once the data is delivered? Am I not doing so? I don't have an explicit disconnect at the end of my http request, so perhaps I am at fault here. So what does maxSockets actually set?

Sorry for some reason I didn't read your question correctly

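maxSockets is the maximum number of concurrent sockets the http agent will open per host (that is, per host:port pair), not per process. You can check what yours is currently set to by reading http.globalAgent.maxSockets. Note that the default of 5 applies to Node 0.10 and earlier; from Node 0.12 onward the default is Infinity, in which case nothing stops hundreds of requests from each opening their own connection.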

You can see some information on the current number of connections you have to a given host with the following:

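var http = require('http');
var key = 'localhost:8080'; // newer Node versions may use a longer key; check Object.keys(http.globalAgent.sockets)
console.log("Active socket connections: %d", (http.globalAgent.sockets[key] || []).length);
console.log("Total queued requests: %d", (http.globalAgent.requests[key] || []).length);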

Substitute localhost:8080 with whatever host and port you are making the request to.
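
If you want a hard cap on connections to one host regardless of the global default, one option is to give your requests a dedicated agent. This is only a minimal sketch: the cap of 5 is an assumed value, and buildOptions is a hypothetical helper that is not part of the original script.

```javascript
var http = require('http');

// Assumption: 5 sockets per host is acceptable to the API provider.
var limitedAgent = new http.Agent({ maxSockets: 5 });

// Hypothetical helper: builds request options that route through the capped agent.
// Requests beyond the cap are queued by the agent until a socket frees up.
function buildOptions(host, path) {
    return {
        host: host,
        path: path,
        method: 'POST',
        agent: limitedAgent
    };
}
```

You would then pass the result of buildOptions to http.request as usual; the agent keeps the extra requests in its queue instead of opening more sockets.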

You can see how node handles these connections at the following two points:

Adding a new connection and storing to the request queue

https://github.com/joyent/node/blob/master/lib/_http_agent.js#L83

Creating connections from queued requests

https://github.com/joyent/node/blob/master/lib/_http_agent.js#L148


I wrote this up really quickly to give you an idea of how you could stagger those requests out a bit. This particular code doesn't check how many requests are still pending, but you could easily modify it so that only a set number of requests are in flight at any given time (which honestly would be the better way to do it).

var Stagger = function (data, stagger, fn, cb) {

    var self        = this;
    this.timerID    = 0;
    this.data       = [].concat(data);
    this.fn         = fn;
    this.cb         = cb;
    this.stagger    = stagger;
    this.iteration  = 0;
    this.store      = {};

    this.start = function () {
        (function __stagger() {

            self.fn(self.iteration, self.data[self.iteration], self.store);

            self.iteration++;

            // Use < rather than != so the loop always terminates, even for an empty data array.
            if (self.iteration < self.data.length)
                self.timerID = setTimeout(__stagger, self.stagger);
            else
                self.cb(self.store);

        })();
    };

    this.stop = function () {
        clearTimeout(self.timerID);

    };
};


var t = new Stagger([1,2,3,4,5,6], 1000, function (i, item, store) {
    console.log(i, item);
    if (!store.out) store.out = [];

    store.out[i] = Math.pow(2,i);
},
function (store) {
    console.log('Done!', store);
});

t.start();

This code could definitely be improved, but it should give you an idea of where to start.

Live Demo: http://jsbin.com/ewoyik/1/edit (note: requires console)
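
For the "better way" mentioned earlier (keeping only a set number of requests in flight), here is a rough sketch. runLimited and its parameters are hypothetical names chosen for illustration. The worker is assumed to call its callback exactly once when its request finishes, and error handling is left out.

```javascript
// Hypothetical helper: calls worker(item, next) for every item, keeping at most
// `limit` workers in flight. done(results) fires once, after every worker has
// called next(result). Results are stored in the same order as the input.
function runLimited(items, limit, worker, done) {
    var queue = [].concat(items);   // copy so the caller's array is untouched
    var results = [];
    var nextIndex = 0;              // index of the next item to start
    var active = 0;                 // workers currently in flight
    var finished = 0;               // workers that have completed
    var total = queue.length;

    if (total === 0) return done(results);

    function launch() {
        // Start workers until the limit is reached or nothing is left to start.
        while (active < limit && nextIndex < total) {
            (function (i) {
                active++;
                worker(queue[i], function (result) {
                    results[i] = result;
                    active--;
                    finished++;
                    if (finished === total) done(results);
                    else launch();  // a slot opened up, so start the next item
                });
            })(nextIndex++);
        }
    }

    launch();
}
```

In your case the worker would issue the POST for one ID and call next from the response's 'end' handler, so a new request only starts when an earlier one has completed.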

