Does Node.js or express itself block event loop?
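I have a very simple Node.js 8.5.0 / Express 4.15.5 server that uses the cluster module and serves static files. The problem is that the event loop sometimes appears to be blocked for far too long. I am using the blocked module, and I also track how many requests were processed since the last check, with a timer interval of 70 ms. Very often the counter is simply zero: with no requests at all, the event loop is sometimes blocked for a full second.
Log: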
Execution blocked for 1056 ms [2017-09-27 16:18:06.322], 1 requests, total requestcount 115, pid 31071
Execution blocked for 358 ms [2017-09-27 16:18:12.570], 0 requests, total requestcount 123, pid 31071
Execution blocked for 1578 ms [2017-09-27 16:18:15.551], 10 requests, total requestcount 147, pid 31071
Execution blocked for 872 ms [2017-09-27 16:18:35.926], 0 requests, total requestcount 557, pid 31077
Execution blocked for 117 ms [2017-09-27 16:20:11.668], 0 requests, total requestcount 761, pid 31077
Execution blocked for 381 ms [2017-09-27 16:23:00.268], 0 requests, total requestcount 2231, pid 31077
Execution blocked for 1206 ms [2017-09-27 16:23:06.096], 2 requests, total requestcount 3147, pid 31070
Execution blocked for 505 ms [2017-09-27 16:23:10.319], 2 requests, total requestcount 2256, pid 31077
Execution blocked for 475 ms [2017-09-27 16:23:10.335], 1 requests, total requestcount 840, pid 31071
Execution blocked for 2113 ms [2017-09-27 16:23:16.918], 1 requests, total requestcount 2283, pid 31077
Execution blocked for 303 ms [2017-09-27 16:23:20.071], 0 requests, total requestcount 3261, pid 31070
Execution blocked for 423 ms [2017-09-27 16:23:23.417], 1 requests, total requestcount 3267, pid 31070
Execution blocked for 6395 ms [2017-09-27 16:23:31.633], 7 requests, total requestcount 3285, pid 31070
Execution blocked for 210 ms [2017-09-27 16:32:04.764], 10 requests, total requestcount 3306, pid 31071
Execution blocked for 690 ms [2017-09-27 16:32:05.945], 1 requests, total requestcount 3313, pid 31071
Execution blocked for 704 ms [2017-09-27 16:32:05.948], 5 requests, total requestcount 5214, pid 31077
Execution blocked for 857 ms [2017-09-27 16:32:07.082], 0 requests, total requestcount 3315, pid 31071
Execution blocked for 1475 ms [2017-09-27 16:32:12.691], 0 requests, total requestcount 3333, pid 31071
Execution blocked for 1487 ms [2017-09-27 16:32:12.692], 1 requests, total requestcount 5247, pid 31077
Execution blocked for 125 ms [2017-09-27 16:32:16.306], 0 requests, total requestcount 7921, pid 31070
Execution blocked for 189 ms [2017-09-27 16:33:16.369], 0 requests, total requestcount 8087, pid 31070
Execution blocked for 182 ms [2017-09-27 16:33:16.621], 0 requests, total requestcount 8087, pid 31070
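As a rough aid for reading a log like this, the stall reports can be grouped by worker pid with a small script. This is only a sketch: the helper names parseLine and summarize are made up for illustration, and the regular expression assumes exactly the line format shown above.

```javascript
// Summarize "Execution blocked" log lines per worker pid.
// The pattern mirrors the log format shown above.
const LINE_RE = /^Execution blocked for (\d+) ms \[([^\]]+)\], (\d+) requests, total requestcount (\d+), pid (\d+)$/;

// Hypothetical helper: turn one log line into an object, or null if it does not match.
function parseLine(line) {
  const m = LINE_RE.exec(line.trim());
  if (!m) return null;
  return {
    blockedMs: Number(m[1]),
    timestamp: m[2],
    requests: Number(m[3]),
    totalRequests: Number(m[4]),
    pid: Number(m[5]),
  };
}

// Hypothetical helper: count stalls per pid, and how many happened with zero requests.
function summarize(lines) {
  const byPid = new Map();
  for (const line of lines) {
    const entry = parseLine(line);
    if (!entry) continue;
    const s = byPid.get(entry.pid) || { stalls: 0, idleStalls: 0, maxBlockedMs: 0 };
    s.stalls += 1;
    if (entry.requests === 0) s.idleStalls += 1;
    s.maxBlockedMs = Math.max(s.maxBlockedMs, entry.blockedMs);
    byPid.set(entry.pid, s);
  }
  return byPid;
}

// Three lines copied from the log above.
const sample = [
  'Execution blocked for 1056 ms [2017-09-27 16:18:06.322], 1 requests, total requestcount 115, pid 31071',
  'Execution blocked for 358 ms [2017-09-27 16:18:12.570], 0 requests, total requestcount 123, pid 31071',
  'Execution blocked for 872 ms [2017-09-27 16:18:35.926], 0 requests, total requestcount 557, pid 31077',
];

for (const [pid, stats] of summarize(sample)) {
  console.log(pid, stats);
}
```

The idleStalls count is the interesting number here: it is the number of stalls during which the worker handled no requests at all.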
strace sample:
epoll_wait(6, [], 1024, 70) = 0
epoll_wait(6, [], 1024, 70) = 0
epoll_wait(6, [], 1024, 70) = 0
write(2, "Execution blocked for 724 ms [20"..., 103) = 103
epoll_wait(6, [{EPOLLIN, {u32=24, u64=24}}], 1024, 70) = 1
read(24, "", 1024) = 0
epoll_ctl(6, EPOLL_CTL_DEL, 24, 0x7fff8ef58de0) = 0
close(24) = 0
epoll_wait(6, [], 1024, 0) = 0
epoll_wait(6, [], 1024, 69) = 0
epoll_wait(6, [], 1024, 70) = 0
Also, there is plenty of memory and CPU available (3 cores):
top - 16:36:50 up 6 days, 5:51, 4 users, load average: 0.17, 0.37, 0.45
Tasks: 137 total, 1 running, 136 sleeping, 0 stopped, 0 zombie
%Cpu(s): 8.3 us, 0.6 sy, 0.0 ni, 91.0 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 4562340 total, 170144 free, 2234000 used, 2158196 buff/cache
KiB Swap: 1048572 total, 993992 free, 54580 used. 2075596 avail Mem
I have also set up GC monitoring, but a GC pause rarely reaches 100 ms:
const obs = new PerformanceObserver((list) => {
  let gc = list.getEntries()[0];
  if (gc.duration > 100) {
    console.warn('gc', gc);
  }
  performance.clearGC();
});
obs.observe({ entryTypes: ['gc'] });
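Is Express, or some module it uses, causing the blocking even when nothing seems to be happening? How can I debug that? If not, is it Node.js itself? If not, then what? Since not blocking the event loop is fundamental to Node.js, I assumed there would be tools for debugging this, but I could not find any.
Edit: tested with both spdy and the native https module; no difference.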
Edit: source code:
"use strict";
const bodyParser = require('body-parser'),
  cluster = require('cluster'),
  cors = require('cors'),
  compress = require('compression'),
  cookieParser = require('cookie-parser'),
  express = require('express'),
  favicon = require('serve-favicon'),
  fs = require('fs'),
  http = require('http'),
  // https = require('spdy'),
  https = require('https'),
  path = require('path'),
  strftime = require('strftime');
const {
  performance,
  PerformanceObserver
} = require('perf_hooks');
global.V = {};
const workers = process.argv[3] || 3;

function blocked(interval, cb, cb_ok) {
  var start = process.hrtime();
  setInterval(function(){
    let delta = process.hrtime(start);
    let nanosec = delta[0] * 1e9 + delta[1];
    let ms = nanosec / 1e6;
    let n = ms - interval;
    if (n > interval) {
      cb(Math.round(n));
    }
    else if (cb_ok) {
      cb_ok(Math.round(n));
    }
    start = process.hrtime();
    V.httpRequests2 = 0;
  }, interval).unref();
}

if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`);
  // Fork workers.
  for (let i = 0; i < workers; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker, code, signal) => {
    console.log(`worker ${worker.process.pid} died`);
    setTimeout(function() {
      console.log('Fork one replacement worker...');
      cluster.fork();
    }, 120000);
  });
}
else {
  V.expressOptions = {
    key: fs.readFileSync('./ssl/server.key'),
    cert: fs.readFileSync('./ssl/ssl-blunde.crt'),
    requestCert: false,
    rejectUnauthorized: false
  };
  V.expressApp = express();
  V.server_ssl = https.createServer(V.expressOptions, V.expressApp);
  V.server_ssl.listen(8080);
  V.expressApp.use(cors({origin: 'https://example.com'}));
  V.expressApp.disable('x-powered-by');
  V.expressApp.use(compress());
  V.expressApp.use(cookieParser());
  V.expressApp.use(favicon(__dirname + '/static/html/favicon.ico'));
  V.expressApp.use(bodyParser.json());
  V.expressApp.use(bodyParser.urlencoded({ extended: true }));
  V.httpRequests = 0;
  V.httpRequests2 = 0;
  V.expressApp.use('*', function(req, res, next) {
    V.httpRequests2++;
    V.httpRequests++;
    next();
  });
  V.expressApp.use('/', express.static(path.join(__dirname, 'static/html')));
  V.expressApp.use(express.static(path.join(__dirname, 'static'), {
    maxAge: 1000 * 60 * 60
  }));
  V.expressApp.use(function (err, req, res, next) {
    console.error(err.stack);
    next(err);
  });
  V.expressApp.use(function (err, req, res, next) {
    if (req.xhr) {
      console.log('Express error', err);
      res.status(500).send({ error: 'Something blew up!' });
    }
    else {
      next(err);
    }
  });
  V.expressApp.use(function (err, req, res, next) {
    console.log('Express error 500', err);
    res.status(500);
  });
  console.log(`Worker ${process.pid} started`);
  blocked(70, function(ms) {
    if (ms > 2500) {
      console.error('Execution blocked for ' + ms + ' ms [' + strftime('%F %T.%L') + '], %s requests, total requestcount %s, pid %s', V.httpRequests2, V.httpRequests, process.pid);
    }
    else if (ms > 500) {
      console.warn('Execution blocked for ' + ms + ' ms [' + strftime('%F %T.%L') + '], %s requests, total requestcount %s, pid %s', V.httpRequests2, V.httpRequests, process.pid);
      if (V.httpRequests > 200000) {
        console.log('Enough requests, exit, requestcount %s, pid %s', V.httpRequests, process.pid);
        process.exit();
      }
    }
    else {
      console.log('Execution blocked for ' + ms + ' ms [' + strftime('%F %T.%L') + '], %s requests, total requestcount %s, pid %s', V.httpRequests2, V.httpRequests, process.pid);
    }
  });
  const obs = new PerformanceObserver((list) => {
    let gc = list.getEntries()[0];
    if (gc.duration > 500) {
      console.warn('GC', gc);
    }
    else if (gc.duration > 100) {
      console.log('GC', gc.duration);
    }
    performance.clearGC();
  });
  obs.observe({ entryTypes: ['gc'] });
}
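Edit: It seems to be related to how Node.js communicates with its threads: every time the event loop is blocked, the following futex/EAGAIN pattern occurs. So apparently Node.js is waiting for something, and that wait is what actually blocks the event loop. The problem is not I/O, because nothing is blocking in any of the threads.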
782050 16:14:56.945451111 5 node (17387) < futex res=0
782051 16:14:56.945493832 3 node (17385) > futex addr=7F8F03C8FB20 op=128(FUTEX_PRIVATE_FLAG) val=2
782052 16:14:56.945494164 5 node (17387) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782053 16:14:56.945494233 3 node (17385) < futex res=-11(EAGAIN)
782054 16:14:56.945494712 3 node (17385) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782055 16:14:56.945494814 5 node (17387) < futex res=0
782056 16:14:56.945494872 3 node (17385) < futex res=0
782057 16:14:56.945495204 3 node (17385) > futex addr=7F8F03C8FB20 op=128(FUTEX_PRIVATE_FLAG) val=2
782058 16:14:56.945495491 5 node (17387) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782059 16:14:56.945495541 3 node (17385) < futex res=-11(EAGAIN)
782060 16:14:56.945495941 5 node (17387) < futex res=0
782061 16:14:56.945495992 3 node (17385) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782062 16:14:56.945496239 3 node (17385) < futex res=0
782063 16:14:56.945496460 3 node (17385) > futex addr=7F8F03C8FB20 op=128(FUTEX_PRIVATE_FLAG) val=2
782064 16:14:56.945496661 5 node (17387) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782065 16:14:56.945496780 3 node (17385) < futex res=-11(EAGAIN)
782066 16:14:56.945497107 5 node (17387) < futex res=0
782067 16:14:56.945497232 3 node (17385) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782068 16:14:56.945497381 3 node (17385) < futex res=0
782069 16:14:56.945497596 3 node (17385) > futex addr=7F8F03C8FB20 op=128(FUTEX_PRIVATE_FLAG) val=2
782070 16:14:56.945497764 5 node (17387) > futex addr=7F8F03C8FB20 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
782071 16:14:56.945497913 3 node (17385) < futex res=-11(EAGAIN)
782072 16:14:56.945498204 5 node (17387) < futex res=0
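Neither node.js nor express should block the event loop when there is nothing to do. Garbage collection may occasionally take some time, but I would not expect it to take anywhere near the 6395 ms you observed.
The built-in methods for serving static files use only asynchronous I/O, so they should not block the event loop either.
If you need further help diagnosing whether something else in your application is the cause, you will probably have to show us your code.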
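For measuring this kind of stall, newer Node.js releases ship an event loop delay histogram in perf_hooks. The sketch below assumes Node.js 11.10 or later, so it would not run on the 8.5.0 setup from the question; blockFor and maxLoopDelayMs are names made up for this illustration.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Busy-wait on the main thread; this is what "blocking the event loop" looks like.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: run `task` while sampling event loop delay and
// return the worst delay seen, in milliseconds.
async function maxLoopDelayMs(task) {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  await sleep(50);  // let the sampler record a few baseline samples first
  await task();
  await sleep(50);  // give the sampler a chance to observe the stall
  histogram.disable();
  return histogram.max / 1e6; // histogram values are reported in nanoseconds
}

// Demo: a deliberate 200 ms busy loop should show up as a large delay.
const demo = maxLoopDelayMs(() => blockFor(200)).then((ms) => {
  console.log(`worst event loop delay: ${ms.toFixed(1)} ms`);
  return ms;
});
```

In a real server the histogram would stay enabled and be read periodically (for example its max or a high percentile) rather than wrapped around a single task.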
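In answer to your direct question:
Does Node.js or express itself block event loop?
No, except for very short garbage collection pauses. If your server uses a large number of JavaScript objects and is very busy, garbage collection may occasionally fall behind and take a while to catch up, but that should only happen on a really busy server running code that uses lots of objects (and therefore creates lots of GC work).
FYI, if all you are doing is serving static files in a high-load environment, there are more efficient ways to serve them than Express. A common approach is to put Nginx in front of your Express server and let Nginx serve the static files directly from the file system. There are also CDNs for large-scale situations.
For more help, please show us your actual Express code so we can see what your server is doing.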
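To answer my own question: there can be a situation where all worker threads are busy. However, that is not my case.
Node.js 8 handles its worker threads differently from previous versions. Downgrading to Node 7.10.1 solved the problem completely. Since the problem also occurs with a simple Express server, I believe it is a bug in Node 8.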
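To illustrate the "all worker threads busy" situation mentioned above (not the Node 8 issue itself), here is a minimal sketch that saturates the libuv thread pool. It assumes the default pool size of 4 (UV_THREADPOOL_SIZE not set); saturatePool is a name made up for this example. Note that this delays callbacks that depend on the pool, such as file system calls, rather than stopping JavaScript from running.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// Assumption: default libuv pool size of 4, so 8 jobs means half of them must queue.
const JOBS = 8;

// Hypothetical helper: fill the pool with hashing work, then time an fs.stat call.
function saturatePool() {
  return new Promise((resolve, reject) => {
    const started = Date.now();
    let hashesDone = 0;
    let statDelayMs = null;

    const finish = () => {
      if (hashesDone === JOBS && statDelayMs !== null) {
        resolve({ hashesDone, statDelayMs });
      }
    };

    // Each asynchronous pbkdf2 call occupies one pool thread until it completes.
    for (let i = 0; i < JOBS; i++) {
      crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err) => {
        if (err) return reject(err);
        hashesDone += 1;
        finish();
      });
    }

    // fs.stat also needs a pool thread, so it is expected to queue behind the hashes.
    fs.stat('.', (err) => {
      if (err) return reject(err);
      statDelayMs = Date.now() - started;
      finish();
    });
  });
}

const result = saturatePool().then((r) => {
  console.log(`fs.stat callback arrived after ${r.statDelayMs} ms`);
  return r;
});
```

Running it with a larger pool, for example UV_THREADPOOL_SIZE=16 in the environment, should shorten the reported delay if pool saturation is the cause.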