
Does Node.js or Express itself block the event loop?

I have a very simple Node.js 8.5.0 / Express 4.15.5 server that uses the cluster module and serves static files. The problem is that the event loop sometimes appears to be blocked for far too long. I'm using the blocked module with a 70 ms interval, and I also keep track of how many requests have been handled since the last check. Often that counter is simply zero: the event loop is blocked, sometimes for a second or more, while there were no requests.

Log:

Execution blocked for 1056 ms [2017-09-27 16:18:06.322], 1 requests, total requestcount 115, pid 31071
Execution blocked for 358 ms [2017-09-27 16:18:12.570], 0 requests, total requestcount 123, pid 31071
Execution blocked for 1578 ms [2017-09-27 16:18:15.551], 10 requests, total requestcount 147, pid 31071
Execution blocked for 872 ms [2017-09-27 16:18:35.926], 0 requests, total requestcount 557, pid 31077
Execution blocked for 117 ms [2017-09-27 16:20:11.668], 0 requests, total requestcount 761, pid 31077
Execution blocked for 381 ms [2017-09-27 16:23:00.268], 0 requests, total requestcount 2231, pid 31077
Execution blocked for 1206 ms [2017-09-27 16:23:06.096], 2 requests, total requestcount 3147, pid 31070
Execution blocked for 505 ms [2017-09-27 16:23:10.319], 2 requests, total requestcount 2256, pid 31077
Execution blocked for 475 ms [2017-09-27 16:23:10.335], 1 requests, total requestcount 840, pid 31071
Execution blocked for 2113 ms [2017-09-27 16:23:16.918], 1 requests, total requestcount 2283, pid 31077
Execution blocked for 303 ms [2017-09-27 16:23:20.071], 0 requests, total requestcount 3261, pid 31070
Execution blocked for 423 ms [2017-09-27 16:23:23.417], 1 requests, total requestcount 3267, pid 31070
Execution blocked for 6395 ms [2017-09-27 16:23:31.633], 7 requests, total requestcount 3285, pid 31070
Execution blocked for 210 ms [2017-09-27 16:32:04.764], 10 requests, total requestcount 3306, pid 31071
Execution blocked for 690 ms [2017-09-27 16:32:05.945], 1 requests, total requestcount 3313, pid 31071
Execution blocked for 704 ms [2017-09-27 16:32:05.948], 5 requests, total requestcount 5214, pid 31077
Execution blocked for 857 ms [2017-09-27 16:32:07.082], 0 requests, total requestcount 3315, pid 31071
Execution blocked for 1475 ms [2017-09-27 16:32:12.691], 0 requests, total requestcount 3333, pid 31071
Execution blocked for 1487 ms [2017-09-27 16:32:12.692], 1 requests, total requestcount 5247, pid 31077
Execution blocked for 125 ms [2017-09-27 16:32:16.306], 0 requests, total requestcount 7921, pid 31070
Execution blocked for 189 ms [2017-09-27 16:33:16.369], 0 requests, total requestcount 8087, pid 31070
Execution blocked for 182 ms [2017-09-27 16:33:16.621], 0 requests, total requestcount 8087, pid 31070

strace example:

epoll_wait(6, [], 1024, 70)             = 0
epoll_wait(6, [], 1024, 70)             = 0
epoll_wait(6, [], 1024, 70)             = 0
write(2, "Execution blocked for 724 ms [20"..., 103) = 103
epoll_wait(6, [{EPOLLIN, {u32=24, u64=24}}], 1024, 70) = 1
read(24, "", 1024)                      = 0
epoll_ctl(6, EPOLL_CTL_DEL, 24, 0x7fff8ef58de0) = 0
close(24)                               = 0
epoll_wait(6, [], 1024, 0)              = 0
epoll_wait(6, [], 1024, 69)             = 0
epoll_wait(6, [], 1024, 70)             = 0

There is also plenty of memory and CPU available (3 cores):

top - 16:36:50 up 6 days,  5:51,  4 users,  load average: 0.17, 0.37, 0.45
Tasks: 137 total,   1 running, 136 sleeping,   0 stopped,   0 zombie
%Cpu(s):  8.3 us,  0.6 sy,  0.0 ni, 91.0 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  4562340 total,   170144 free,  2234000 used,  2158196 buff/cache
KiB Swap:  1048572 total,   993992 free,    54580 used.  2075596 avail Mem

I also set up GC monitoring, but the 100 ms threshold below is rarely reached:

const obs = new PerformanceObserver((list) => {
    let gc = list.getEntries()[0];
    if (gc.duration > 100) {
        console.warn('gc', gc);                 
    }
    performance.clearGC();
});
obs.observe({ entryTypes: ['gc'] });

Does Express, or one of the modules it uses, cause the blocking even though seemingly nothing is happening? How can I debug that? If not, is it Node.js itself? If not, what is it? Since not blocking the event loop is one of the basics of Node.js, I'd presume there are tools for debugging this, but I couldn't find any.

Edit: Tested with both the spdy module and the native https module; no difference.

Edit: Source code:

"use strict";

const   bodyParser = require('body-parser'),
        cluster = require('cluster'),
        cors = require('cors'),
        compress = require('compression'),
        cookieParser = require('cookie-parser'),
        express = require('express'),
        favicon = require('serve-favicon'),
        fs = require('fs'),
        http = require('http'),
//        https = require('spdy'),
        https = require('https'),
        path = require('path'),
        strftime = require('strftime');

const {
        performance,
        PerformanceObserver
} = require('perf_hooks');

global.V = {};

const workers = process.argv[3] || 3;

function blocked(interval, cb, cb_ok) {
        var start = process.hrtime();

        setInterval(function(){
                let delta = process.hrtime(start);
                let nanosec = delta[0] * 1e9 + delta[1];
                let ms = nanosec / 1e6;
                let n = ms - interval;
                if (n > interval) {
                        cb(Math.round(n));
                }
                else if (cb_ok) {
                        cb_ok(Math.round(n));
                }
                start = process.hrtime();
                V.httpRequests2 = 0;
        }, interval).unref();
}

if (cluster.isMaster) {
        console.log(`Master ${process.pid} is running`);

        // Fork workers.
        for (let i = 0; i < workers; i++) {
                cluster.fork();
        }

        cluster.on('exit', (worker, code, signal) => {
                console.log(`worker ${worker.process.pid} died`);
                setTimeout(function() {
                        console.log('Fork one replacement worker...');
                        cluster.fork();
                }, 120000);
        });
}
else {
    V.expressOptions = {
           key: fs.readFileSync('./ssl/server.key'),
           cert: fs.readFileSync('./ssl/ssl-blunde.crt'),
           requestCert: false,
           rejectUnauthorized: false
    };

    V.expressApp = express();

    V.server_ssl = https.createServer(V.expressOptions, V.expressApp);
    V.server_ssl.listen(8080);
    V.expressApp.use(cors({origin: 'https://example.com'}));
    V.expressApp.disable('x-powered-by');
    V.expressApp.use(compress());
    V.expressApp.use(cookieParser());
    V.expressApp.use(favicon(__dirname + '/static/html/favicon.ico'));
    V.expressApp.use(bodyParser.json());
    V.expressApp.use(bodyParser.urlencoded({ extended: true }));

    V.httpRequests = 0;
    V.httpRequests2 = 0;
    V.expressApp.use('*', function(req, res, next) {
            V.httpRequests2++;
            V.httpRequests++;
            next();
    });
    V.expressApp.use('/', express.static(path.join(__dirname, 'static/html')));

    V.expressApp.use(express.static(path.join(__dirname, 'static'), {
         maxAge: 1000 * 60 * 60
    }));
    V.expressApp.use(function (err, req, res, next) {
            console.error(err.stack);
            next(err);
    });
    V.expressApp.use(function (err, req, res, next) {
           if (req.xhr) {
                   console.log('Express error', err);
                   res.status(500).send({ error: 'Something blew up!' });
           }
           else {
                   next(err);
           }
    });
    V.expressApp.use(function (err, req, res, next) {
           console.log('Express error 500', err);
           // End the response; setting the status alone leaves the request hanging.
           res.status(500).end();
    });

    console.log(`Worker ${process.pid} started`);
    blocked(70, function(ms) {
           if (ms > 2500) {
                   console.error('Execution blocked for ' + ms + ' ms [' + strftime('%F %T.%L') + '], %s requests, total requestcount %s, pid %s', V.httpRequests2, V.httpRequests, process.pid);
           }
           else if (ms > 500) {
                   console.warn('Execution blocked for ' + ms + ' ms [' + strftime('%F %T.%L') + '], %s requests, total requestcount %s, pid %s', V.httpRequests2, V.httpRequests, process.pid);
                   if (V.httpRequests > 200000) {
                           console.log('Enough requests, exit, requestcount %s, pid %s', V.httpRequests, process.pid);
                           process.exit();
                   }
           }
           else {
                console.log('Execution blocked for ' + ms + ' ms [' + strftime('%F %T.%L') + '], %s requests, total requestcount %s, pid %s', V.httpRequests2, V.httpRequests, process.pid);
           }
   });
   const obs = new PerformanceObserver((list) => {
           let gc = list.getEntries()[0];
           if (gc.duration > 500) {
                   console.warn('GC', gc);
           }
           else if (gc.duration > 100) {
                   console.log('GC', gc.duration);
           }
           performance.clearGC();
   });
   obs.observe({ entryTypes: ['gc'] });     
}

Edit: It seems to be related to how Node.js communicates with its threads: the following futex/EAGAIN pattern appears every time the event loop is blocked. So Node.js is apparently waiting for something, which in practice blocks the event loop. The problem does not look like I/O, as none of the threads is blocked on I/O.

782050 16:14:56.945451111 5 node (17387) < futex res=0
782051 16:14:56.945493832 3 node (17385) > futex addr=7F8F03C8FB20 op=128(FUTEX_PRIVATE_FLAG) val=2
782052 16:14:56.945494164 5 node (17387) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782053 16:14:56.945494233 3 node (17385) < futex res=-11(EAGAIN)
782054 16:14:56.945494712 3 node (17385) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782055 16:14:56.945494814 5 node (17387) < futex res=0
782056 16:14:56.945494872 3 node (17385) < futex res=0
782057 16:14:56.945495204 3 node (17385) > futex addr=7F8F03C8FB20 op=128(FUTEX_PRIVATE_FLAG) val=2
782058 16:14:56.945495491 5 node (17387) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782059 16:14:56.945495541 3 node (17385) < futex res=-11(EAGAIN)
782060 16:14:56.945495941 5 node (17387) < futex res=0
782061 16:14:56.945495992 3 node (17385) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782062 16:14:56.945496239 3 node (17385) < futex res=0
782063 16:14:56.945496460 3 node (17385) > futex addr=7F8F03C8FB20 op=128(FUTEX_PRIVATE_FLAG) val=2
782064 16:14:56.945496661 5 node (17387) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782065 16:14:56.945496780 3 node (17385) < futex res=-11(EAGAIN)
782066 16:14:56.945497107 5 node (17387) < futex res=0
782067 16:14:56.945497232 3 node (17385) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782068 16:14:56.945497381 3 node (17385) < futex res=0
782069 16:14:56.945497596 3 node (17385) > futex addr=7F8F03C8FB20 op=128(FUTEX_PRIVATE_FLAG) val=2
782070 16:14:56.945497764 5 node (17387) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782071 16:14:56.945497913 3 node (17385) < futex res=-11(EAGAIN)
782072 16:14:56.945498204 5 node (17387) < futex res=0

Node.js and Express should not block the event loop when there is nothing to do. There may occasionally be a bit of time spent on garbage collection, but I would not expect that to take anywhere near the 6395 ms you observed.

The built-in means of serving static files (express.static) uses only asynchronous I/O, so that should not be blocking the event loop either.
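To make the distinction concrete, here is a minimal sketch (not taken from the question's code) of how synchronous work delays a timer while an asynchronous file-system call does not. The helper names busyWait and timerLag are made up for illustration, and the 20 ms and 200 ms values are arbitrary.

```javascript
'use strict';

const fs = require('fs');

// Hypothetical helper: spin synchronously for at least `ms` milliseconds.
// It stands in for any CPU-bound code or *Sync call on the main thread.
function busyWait(ms) {
    const start = process.hrtime();
    let elapsed = 0;
    while (elapsed < ms) {
        const d = process.hrtime(start);
        elapsed = d[0] * 1e3 + d[1] / 1e6;
    }
    return elapsed;
}

// Hypothetical helper: schedule a timer, run `work` right away, and report
// how many ms later than requested the timer callback actually fired.
function timerLag(delayMs, work, done) {
    const start = process.hrtime();
    setTimeout(function () {
        const d = process.hrtime(start);
        done(d[0] * 1e3 + d[1] / 1e6 - delayMs);
    }, delayMs);
    work();
}

// Asynchronous call: the loop stays free, so the lag should stay small.
timerLag(20, function () {
    fs.readdir('.', function () {});
}, function (lag) {
    console.log('async fs call, timer lag ~' + Math.round(lag) + ' ms');

    // Synchronous spin: the timer cannot fire until the spin returns.
    timerLag(20, function () {
        busyWait(200);
    }, function (lag2) {
        console.log('busy wait, timer lag ~' + Math.round(lag2) + ' ms');
    });
});
```

If the second lag is large and the first is not, the loop is being held by synchronous code rather than by I/O, which is the kind of thing the blocked module in the question is meant to detect.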

If you want further help diagnosing what else in your app might be causing this, then you will likely have to show us your code.

In answer to your direct question:

Does Node.js or Express itself block the event loop?

No, other than very short periods of time for garbage collection. If your server uses a very large number of JavaScript objects and is very busy, garbage collection can occasionally fall behind and take a while to catch up, but that would only happen on a very busy server whose code allocates lots of objects (thus creating lots of GC work).

FYI, if all you are doing is serving static files in a high-load environment, then there are more efficient ways of doing that than using Express. One common method is to put Nginx in front of the Express server and have Nginx serve the static files directly from the file system. There are also CDNs for larger-scale situations.

For further help, please show us your actual Express code so we can see what your server is doing.

To answer my own question: there can be a scenario where all worker threads (the libuv thread pool) are busy. This wasn't my case, however.
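That scenario can be reproduced in isolation. The sketch below is only an illustration, not the code from the question. It assumes the default pool of four threads (UV_THREADPOOL_SIZE not set), uses crypto.pbkdf2 as a stand-in for slow pool work and crypto.randomBytes as a stand-in for a cheap request that needs the same pool. saturatePool is a made-up name.

```javascript
'use strict';

const crypto = require('crypto');

// Hypothetical demo: queue `slowJobs` expensive thread-pool tasks, then one
// cheap task, and count how many slow tasks finish before the cheap one.
function saturatePool(slowJobs, done) {
    let slowFinished = 0;

    for (let i = 0; i < slowJobs; i++) {
        // The async form of pbkdf2 runs on the thread pool, not the main thread.
        crypto.pbkdf2('secret', 'salt', 50000, 64, 'sha512', function (err) {
            if (err) throw err;
            slowFinished++;
        });
    }

    // The cheap request waits in the same queue behind the slow jobs,
    // even though it needs almost no CPU time itself.
    crypto.randomBytes(16, function (err) {
        if (err) throw err;
        done(slowFinished);
    });
}

saturatePool(8, function (slowFinishedFirst) {
    console.log('slow jobs finished before the cheap one: ' + slowFinishedFirst);
});
```

Note that in this situation the event loop itself keeps turning and timers still fire on time. Only callbacks that depend on the pool are delayed, so an interval-based check like the one in the question would not report it as blocking.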

There is something different in how Node.js 8 handles its worker threads compared to previous versions. Downgrading to Node 7.10.1 completely solved the problem. As the problem also occurs in a simple Express server, I'd conclude it is a bug in Node 8.
