High Latency with NodeJS

This problem pertains specifically to Nodejitsu, but similar effects seem to happen on other VPSes. I have a real-time game using socket.io, and one thing I've noticed is that occasionally the server will wait an inordinate amount of time before responding. If multiple requests are sent during that window, they behave as if they had all been queued up and then processed at once. I suspect it's vaguely correlated with the presence of other users on the shared hardware (as is the case with any VPS).

Anyway, to test this out (and make sure that it wasn't due to my game's code), I built a minimal test case:

var express = require('express');
var http = require('http');

var app = express();
var server = http.createServer(app);

var io = require('socket.io').listen(server);

io.sockets.on('connection', function(sock){
    sock.on('perf', function(data, cb){
        cb([Date.now()]); // respond with the current time
    });
});

app.get('/', function(req, res){
    // CORS headers so that a page on another origin can call this endpoint
    res.header("Access-Control-Allow-Origin", "*");
    res.header("Access-Control-Allow-Methods", "HEAD,GET,PUT,POST,DELETE");
    res.header("Access-Control-Allow-Headers", "X-Requested-With");

    res.end(JSON.stringify([Date.now().toString()])); // HTTP equivalent of the perf handler
});

server.listen(process.env.PORT || 6655, function(){
    console.log('listening now');
});

I had a simple blank HTML page with socket.io which would periodically send a perf event and time how long it took for the callback to fire. And it still shows the same thing:

[Figure: graph showing the latency spikes]

Note that the bar length represents the square root of the amount of time, not the linear quantity.
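For reference, the timing logic of the test page described above could look roughly like the sketch below. It is a reconstruction and not the original page: the helper name measureRoundTrip, the empty payload and the 250 ms interval are all assumptions. The only thing taken from the server code is that the 'perf' handler calls its acknowledgement callback with a one-element array holding the server time.

```javascript
// Hypothetical helper: times a single 'perf' round trip.
// `emit` is assumed to have the shape emit(event, data, ack), which is how
// socket.io's socket.emit is called when an acknowledgement is requested.
function measureRoundTrip(emit, onResult) {
    var start = Date.now();
    emit('perf', {}, function (reply) {
        // reply is the array passed to cb() by the server's 'perf' handler
        onResult({ rtt: Date.now() - start, serverTime: reply[0] });
    });
}

// In the browser this might be driven as follows (assumes the socket.io
// client script is loaded; the interval is an arbitrary choice):
//
//   var socket = io.connect();
//   setInterval(function () {
//       measureRoundTrip(socket.emit.bind(socket), function (r) {
//           console.log('round trip: ' + r.rtt + ' ms');
//       });
//   }, 250);
```

Passing emit in as a parameter keeps the timing code independent of the transport, so the same helper can be pointed at a stub or at an XHR wrapper for comparison.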

When I use XHR instead of socket.io to take a similar measurement of the response time, the result is pretty similar: a lot of low-latency responses (though with a higher baseline than websockets, as expected) and some occasional spikes that appear to pile up.

The odd thing is that if you open it up in multiple browser windows and different browsers, there seems to be a correlation between the different browsers (and the fact that it's totally absent or significantly less frequent on some servers) which seems to imply that it's a server side phenomenon. However, there are latency spikes that happen for some browsers but not others, and the two Chrome windows which are of the same session appear to be virtually exact duplicates, which suggests that it's something that happens locally (per computer, or per browser, networking wise).

From Left to Right: Chrome Incognito, Chrome (regular), Firefox, Chrome (regular)

[Figure: the graphs in the four windows]

Anyway, this has been confusing me for months and I'd really like to understand what is causing it and how to fix it.

I assume you have checked whether you have a CPU or RAM issue.

The only thing that can slow down node in a "surprising" way is the garbage collector. Try running node with one of the --trace* flags (for example --trace-gc) to see what is going on. (See node --v8-options for the full list.)

Personally, I assume you won't find anything that way, because (and that's just my feeling) the issue is somewhere else.

With that perfect delay of a multiple of 500 ms, I suspect you have packet loss. You can check with ifconfig whether that is a general issue, then capture the packets with tcpdump and see whether they are retransmitted.
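If you want a quick way to tell a stalled node process (garbage collection, CPU starvation) apart from a network problem, one option is to watch how late a repeating timer fires. The sketch below is only an illustration: the helper name watchEventLoopLag and the interval and threshold values are assumptions. If the latency spikes show up in the browser while this reports no lag on the server, the delay is probably happening outside the node process.

```javascript
// Hypothetical helper: calls onLag(ms) whenever a repeating timer fires
// more than thresholdMs later than it was scheduled to.
function watchEventLoopLag(intervalMs, thresholdMs, onLag) {
    var expected = Date.now() + intervalMs;
    return setInterval(function () {
        var now = Date.now();
        var lag = now - expected; // how late this tick fired
        if (lag > thresholdMs) {
            onLag(lag);
        }
        expected = now + intervalMs;
    }, intervalMs);
}

// Possible usage on the server (values are arbitrary):
//
//   watchEventLoopLag(100, 50, function (lag) {
//       console.log('event loop blocked for about ' + lag + ' ms');
//   });
```

The function returns the timer handle, so the check can be stopped again with clearInterval.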

I know this may sound strange, but have you considered that this is not an issue with node but with the OS settings? Have you checked your file handles and the number of connections the OS shows for the socket? Have you also made sure the socket timeout in the OS is low enough? I have run into similar-sounding performance issues with other code, and it turned out to be the OS and not the code. Also check the package and see how many open connections it allows on the socket. I have not looked at the node code, but I ran into a similar issue with the HTTP client library in Java: the application just backed up, and it turned out to be only a configuration issue with the number of connections.

The reason you see this is Nagle's algorithm. It is a TCP algorithm that holds back small writes for a short while and then sends them together as bigger chunks, which saves transmissions on the socket. You can read more about it here: http://en.wikipedia.org/wiki/Nagle's_algorithm

To disable Nagle's algorithm (good when you want to send lots of small requests as fast as possible) you can call socket.setNoDelay(true); if you're using net.Socket. In the case of socket.io I believe Nagle is already disabled by default for WebSockets, but not necessarily for other transports. I would recommend running a test with a plain net.Socket from node.js, disabling Nagle, and seeing what you get.
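A minimal sketch of such a test is shown below. The helper name echoRoundTrips and the one-byte payload are assumptions, and it runs over loopback only to stay self-contained. On loopback the difference between the two settings is likely to be small, so for a real comparison the client half would need to connect to the remote host.

```javascript
var net = require('net');

// Hypothetical helper: starts a local echo server, sends `count` one-byte
// messages one after another, and passes the round-trip times (in ms) to done().
function echoRoundTrips(count, noDelay, done) {
    var server = net.createServer(function (conn) {
        conn.setNoDelay(noDelay); // true disables Nagle's algorithm
        conn.pipe(conn);          // echo everything back
    });

    server.listen(0, '127.0.0.1', function () {
        var times = [];
        var start;
        var client = net.connect(server.address().port, '127.0.0.1', function () {
            client.setNoDelay(noDelay);
            start = process.hrtime();
            client.write('x');
        });

        client.on('data', function () {
            var d = process.hrtime(start);
            times.push(d[0] * 1e3 + d[1] / 1e6);
            if (times.length < count) {
                start = process.hrtime();
                client.write('x');
            } else {
                client.end();
                server.close(function () { done(times); });
            }
        });
    });
}

// Possible usage: compare both settings.
//
//   echoRoundTrips(100, true, function (times) {
//       console.log('max with Nagle disabled: ' + Math.max.apply(null, times) + ' ms');
//   });
```

Each message is sent only after the previous echo has arrived, so every measurement covers exactly one round trip.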
