Make node.js not exit on error

I am working on a WebSocket-oriented node.js server using Socket.IO. I noticed a bug where certain browsers don't follow the correct connection procedure, and the code isn't written to handle that gracefully. In short, it calls a method on an object that was never set up, which throws an error and kills the server.

My concern isn't with this bug in particular, but with the fact that when such errors occur, the entire server goes down. Is there anything I can do at a global level in Node so that when an error occurs it simply logs a message, perhaps kills the event, but keeps the server process running?

I don't want other users' connections to go down because one clever user exploits an uncaught error in a large included codebase.

You can attach a listener to the uncaughtException event of the process object.

Code taken from the actual Node.js API reference (it's the second item under "process"):

process.on('uncaughtException', function (err) {
  console.log('Caught exception: ', err);
});

setTimeout(function () {
  console.log('This will still run.');
}, 500);

// Intentionally cause an exception, but don't catch it.
nonexistentFunc();
console.log('This will not run.');

All you have to do now is log it or do something with it. If you know under what circumstances the bug occurs, you should file a bug on Socket.IO's GitHub page:
https://github.com/LearnBoost/Socket.IO-node/issues

Using uncaughtException is a very bad idea.

The best alternative is to use domains in Node.js 0.8. If you're on an earlier version of Node.js, use forever to restart your processes or, even better, use node cluster to spawn multiple worker processes and restart a worker in the event of an uncaughtException.
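The cluster suggestion could be sketched roughly as follows. This is only an illustration: superviseWorkers is a made-up helper name, the cluster object is passed in as a parameter purely so the logic can be tried with a stub, and './server' is a hypothetical entry point.

```javascript
const cluster = require('cluster');

// Fork `count` workers and start a replacement whenever one exits.
// `clusterApi` is normally the built-in cluster module (assumption: it is
// passed in as a parameter only to make the function easy to exercise).
function superviseWorkers(clusterApi, count) {
  for (let i = 0; i < count; i++) {
    clusterApi.fork();
  }
  clusterApi.on('exit', (worker, code, signal) => {
    console.error(
      `worker ${worker.process.pid} exited (${signal || code}); starting a replacement`
    );
    clusterApi.fork();
  });
}

// Typical wiring, left commented out so that loading this file forks nothing:
// if (cluster.isPrimary || cluster.isMaster) {
//   superviseWorkers(cluster, require('os').cpus().length);
// } else {
//   require('./server'); // hypothetical module that starts the Socket.IO server
// }
```

A worker that crashes immediately on startup would be restarted in a tight loop with this sketch, so real code would probably want a delay or a restart limit.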

From: http://nodejs.org/api/process.html#process_event_uncaughtexception

Warning: Using 'uncaughtException' correctly

Note that 'uncaughtException' is a crude mechanism for exception handling intended to be used only as a last resort. The event should not be used as an equivalent to On Error Resume Next. Unhandled exceptions inherently mean that an application is in an undefined state. Attempting to resume application code without properly recovering from the exception can cause additional unforeseen and unpredictable issues.

Exceptions thrown from within the event handler will not be caught. Instead the process will exit with a non-zero exit code and the stack trace will be printed. This is to avoid infinite recursion.

Attempting to resume normally after an uncaught exception can be similar to pulling out the power cord when upgrading a computer: nine out of ten times nothing happens, but the 10th time, the system becomes corrupted.

The correct use of 'uncaughtException' is to perform synchronous cleanup of allocated resources (e.g. file descriptors, handles, etc.) before shutting down the process. It is not safe to resume normal operation after 'uncaughtException'.

To restart a crashed application in a more reliable way, whether uncaughtException is emitted or not, an external monitor should be employed in a separate process to detect application failures and recover or restart as needed.
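As a rough illustration of that advice, a handler might do nothing more than a synchronous log and then exit, leaving the restart to an external monitor. The name handleFatal and its injectable exit parameter are assumptions made for this sketch, not anything from the Node.js docs.

```javascript
const fs = require('fs');

// Log synchronously, then exit with a non-zero code so that an external
// monitor can see the failure and restart the process.
// `exit` defaults to process.exit; it is a parameter only so the function
// can be tried out without actually killing the process.
function handleFatal(err, exit = process.exit) {
  const message = `[fatal] ${err && err.stack ? err.stack : String(err)}\n`;
  try {
    fs.writeSync(2, message); // fd 2 is stderr; the write completes before exit
  } catch (writeErr) {
    // Nothing sensible is left to do if stderr itself is unusable.
  }
  exit(1);
}

process.on('uncaughtException', (err) => handleFatal(err));
```

Any other cleanup placed in the handler would need to be synchronous as well, since the process is about to exit.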

I just did a bunch of research on this (see here, here, here, and here), and the answer to your question is that Node will not allow you to write one error handler that catches every error scenario that could possibly occur in your system.

Some frameworks like express will allow you to catch certain types of errors (when an async method returns an error object), but there are other conditions that you cannot catch with a global error handler. This is a limitation (in my opinion) of Node and possibly inherent to async programming in general.

For example, say you have the following express handler:

app.get("/test", function(req, res, next) {
    require("fs").readFile("/some/file", function(err, data) {
        if(err)
            next(err);
        else
            res.send("yay");
    });
});

Let's say that the file "/some/file" does not actually exist. In this case fs.readFile will pass an error as the first argument to the callback. If you check for that and call next(err) when it happens, the default express error handler will take over and do whatever you make it do (e.g. return a 500 to the user). That's a graceful way to handle an error. Of course, if you forget to call next(err), it doesn't work.
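For reference, a custom error handler that next(err) would reach might look something like the sketch below. The function name errorHandler and the response text are illustrative choices, and the code assumes Express 4 style response methods.

```javascript
// Express treats a middleware with four arguments as an error handler.
function errorHandler(err, req, res, next) {
  console.error(err && err.stack ? err.stack : err);
  if (res.headersSent) {
    // A response has already started, so hand over to Express's default handler.
    return next(err);
  }
  res.status(500).send('Internal Server Error');
}

// Registration goes after all routes (`app` is the express application):
// app.use(errorHandler);
```

Note that this only sees errors that are passed to next(err) or thrown synchronously inside a route handler.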

So that's the error condition that a global handler can deal with. However, consider another case:

app.get("/test", function(req, res, next) {
    require("fs").readFile("/some/file", function(err, data) {
        if(err)
            next(err);
        else {
            nullObject.someMethod(); //throws a null reference exception
            res.send("yay");
        }
    });
});

In this case, there is a bug in your code that results in calling a method on a null object. An exception will be thrown, it will not be caught by the global error handler, and your node app will terminate. All clients currently executing requests on that service will be suddenly disconnected with no explanation as to why. Ungraceful.

There is currently no global error handler functionality in Node to handle this case. You cannot put a giant try/catch around all your express handlers because by the time your async callback executes, those try/catch blocks are no longer in scope. That's just the nature of async code: it breaks the try/catch error handling paradigm.

AFAIK, your only recourse here is to put try/catch blocks around the synchronous parts of your code inside each one of your async callbacks, something like this:

app.get("/test", function(req, res, next) {
    require("fs").readFile("/some/file", function(err, data) {
        if(err) {
            next(err);
        }
        else {
            try {
                nullObject.someMethod(); //throws a null reference exception
                res.send("yay");
            }
            catch(e) {
                res.send(500);
            }
        }
    });
});

That's going to make for some nasty code, especially once you start getting into nested async calls.
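One way to cut down on the repetition is a small wrapper that adds the try/catch for you. The sketch below is only an illustration: guard is a made-up helper, not part of Node or express.

```javascript
// Wrap a callback so that an exception thrown synchronously inside it is
// passed to onError instead of escaping to the event loop.
function guard(onError, callback) {
  return function guarded(...args) {
    try {
      return callback.apply(this, args);
    } catch (e) {
      onError(e);
    }
  };
}

// Hypothetical use inside the route handler shown earlier:
// require("fs").readFile("/some/file", guard(next, function (err, data) {
//     if (err) return next(err);
//     nullObject.someMethod(); // the exception now goes to next(e)
//     res.send("yay");
// }));
```

The wrapper only covers the synchronous body of the callback it wraps, so each nested async callback still needs its own guard.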

Some people think that what Node does in these cases (that is, die) is the proper thing to do because your system is in an inconsistent state and you have no other option. I disagree with that reasoning but I won't get into a philosophical debate about it. The point is that with Node, your options are lots of little try/catch blocks or hoping that your test coverage is good enough that this doesn't happen. You can put something like upstart or supervisor in place to restart your app when it goes down, but that's simply mitigation of the problem, not a solution.

Node.js has a currently unstable feature called domains that appears to address this issue, though I don't know much about it.

I've just put together a class which listens for unhandled exceptions, and when it sees one it:

  • prints the stack trace to the console
  • logs it in its own logfile
  • emails you the stack trace
  • restarts the server (or kills it, up to you)

It will require a little tweaking for your application as I haven't made it generic yet, but it's only a few lines and it might be what you're looking for!

Check it out!

(Note: this is over 4 years old at this point, unfinished, and there may now be a better way. I don't know!)

process.on
(
    'uncaughtException',
    function (err)
    {
        var stack = err.stack;
        var timeout = 1;

        // print note to logger
        logger.log("SERVER CRASHED!");
        // logger.printLastLogs();
        logger.log(err, stack);


        // save log to timestamped logfile
        // var filename = "crash_" + _2.formatDate(new Date()) + ".log";
        // logger.log("LOGGING ERROR TO "+filename);
        // var fs = require('fs');
        // fs.writeFile('logs/'+filename, log);


        // email log to developer
        if(helper.Config.get('email_on_error') == 'true')
        {
            logger.log("EMAILING ERROR");
            require('./Mailer'); // this is a simple wrapper around nodemailer http://documentup.com/andris9/nodemailer/
            helper.Mailer.sendMail("GAMEHUB NODE SERVER CRASHED", stack);
            timeout = 10;
        }

        // Send signal to clients
//      logger.log("EMITTING SERVER DOWN CODE");
//      helper.IO.emit(SIGNALS.SERVER.DOWN, "The server has crashed unexpectedly. Restarting in 10s..");


        // If we exit straight away, the write log and send email operations wont have time to run
        setTimeout
        (
            function()
            {
                logger.log("KILLING PROCESS");
                process.exit(1); // non-zero exit code marks this as a crash
            },
            // timeout * 1000
            timeout * 100000 // extra time. pm2 auto-restarts on crash...
        );
    }
);

Had a similar problem. Ivo's answer is good. But how can you catch an error in a loop and continue?

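var fs = require('fs');
var folder='/anyFolder';
fs.readdir(folder, function(err,files){
    if(err){ console.log(err); return; } // files is undefined when readdir fails
    for(var i=0; i<files.length; i++){
        var stats = fs.statSync(folder+'/'+files[i]); // throws if the entry cannot be stat'ed
    }
});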

Here, fs.statSync throws an error (against a hidden file in Windows that barfs, I don't know why). The error can be caught by the process.on(...) trick, but the loop stops.

I tried adding a handler directly:

var stats = fs.statSync(folder+'/'+files[i]).on('error',function(err){console.log(err);});

This did not work either (fs.statSync returns a plain fs.Stats object rather than an event emitter, and it throws before .on() is ever reached).

Adding a try/catch around the questionable fs.statSync() was the best solution for me:

var stats;
try{
    stats = fs.statSync(path); // path: the full path built from folder and file name
}catch(err){console.log(err);} // log the error and let the loop continue

This then led to the code fix (making a clean path var from folder and file).
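Put together, the loop with a per-entry try/catch might look roughly like this. It is a sketch under a few assumptions: statEach is a made-up helper name, it uses path.join to build the clean path, and it uses the synchronous fs calls for brevity.

```javascript
const fs = require('fs');
const path = require('path');

// Stat every entry, recording failures instead of letting one bad entry
// abort the whole loop.
function statEach(folder, names) {
  const results = [];
  const failures = [];
  for (const name of names) {
    const fullPath = path.join(folder, name);
    try {
      results.push({ name, stats: fs.statSync(fullPath) });
    } catch (err) {
      failures.push({ name, error: err });
    }
  }
  return { results, failures };
}

// Example: stat everything in the current directory.
const report = statEach('.', fs.readdirSync('.'));
console.log(`${report.results.length} ok, ${report.failures.length} failed`);
```

Collecting the failures rather than only logging them makes it easier to see afterwards which entries caused trouble.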

I found PM2 to be the best solution for handling Node servers, both single and multiple instances.
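A PM2 setup is usually described in an ecosystem file. The fragment below is a minimal sketch: the app name and script path are placeholders, and the numeric values are arbitrary examples rather than recommendations.

```javascript
// ecosystem.config.js (name and script are placeholders for your own app)
const config = {
  apps: [
    {
      name: 'my-app',
      script: './server.js',
      instances: 2,          // number of processes to run
      exec_mode: 'cluster',  // let PM2 balance connections across the instances
      autorestart: true,     // restart a process when it exits or crashes
      max_restarts: 10,      // stop restarting after repeated unstable restarts
      restart_delay: 1000    // milliseconds to wait before restarting
    }
  ]
};

module.exports = config;
```

It would be started with `pm2 start ecosystem.config.js`. For a Socket.IO server, running more than one instance may also need sticky sessions, which this fragment does not cover.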

One way of doing this would be to spawn a child process and communicate with the parent process via the 'message' event.

In the child process where the error occurs, catch it with 'uncaughtException' to avoid crashing the application. Mind that exceptions thrown from within the event handler will not be caught. Once the error is caught safely, send a message like: {finish: false}.

The parent process would listen to the message event and send the message again to the child process to re-run the function.

Child Process:

// In child.js
// function causing an exception
const errorComputation = function() {
    for (let i = 0; i < 50; i++) {
        console.log('i is.......', i);
        if (i === 25) {
            throw new Error('i = 25');
        }
    }
    process.send({finish: true});
};

// Catch the error so the child keeps running, and report the failure to the parent.
// Exceptions thrown inside this handler are not caught: the process would exit
// with a non-zero exit code and print the stack trace (this avoids infinite recursion).
process.on('uncaughtException', err => {
    console.log('uncaught exception..', err.message);
    process.send({finish: false});
});

// listen to the parent process and run the errorComputation again
process.on('message', () => {
    console.log('starting process ...');
    errorComputation();
});

Parent Process:

// In parent.js
const { fork } = require('child_process');

const compute = fork('child.js');

// listen to the child process
compute.on('message', (data) => {
    if (!data.finish) {
        // The child reported a failure: ask it to run again.
        // Note: errorComputation always throws at i = 25, so as written this retries indefinitely.
        compute.send('start');
    } else {
        console.log('Child process finished successfully!');
    }
});

// send initial message to start the child process.
compute.send('start');
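Because the child in this example fails every time, the parent would keep re-sending 'start' forever. A capped variant of the message handler could be sketched as below. The names makeRetryHandler, maxRetries and onGiveUp are made up for this illustration.

```javascript
// Build a 'message' handler that re-sends 'start' at most maxRetries times.
// `child` only needs a send() method, like the object returned by fork().
function makeRetryHandler(child, maxRetries, onGiveUp) {
  let retries = 0;
  return function onMessage(data) {
    if (data.finish) {
      console.log('Child process finished successfully!');
      return;
    }
    if (retries < maxRetries) {
      retries += 1;
      child.send('start');
    } else {
      onGiveUp(new Error(`child still failing after ${maxRetries} retries`));
    }
  };
}

// Hypothetical wiring in parent.js, where compute is the forked child:
// compute.on('message', makeRetryHandler(compute, 3, (err) => {
//     console.error(err.message);
//     compute.kill();
// }));
```

What to do once the limit is reached (kill the child, alert someone, exit the parent) depends on the application.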
