简体   繁体   English

为什么Node cluster.fork()在作为模块实现时分支父作用域

[英]Why is Node cluster.fork() forking the parent scope when implemented as a module

I'm attempting to implement a Node module which uses cluster . 我正在尝试实现一个使用cluster的Node模块。 The problem is that the entire parent scope is forked alongside the intended cluster code. 问题是整个父作用域与预期的集群代码一起分叉。 I discovered it while writing tests in Mocha for the module: the test suite will run many times, instead of once. 我在Mocha中为模块编写测试时发现了它:测试套件将运行多次,而不是一次。

See below, myModule.js creates N workers, one for each CPU. 如下所示,myModule.js创建N个worker,每个CPU一个。 These workers are http servers, or could be anything else. 这些工作者是http服务器,或者可能是其他任何工作者。

Each time the test.js runs, the script runs N + 1 times. 每次运行test.js时,脚本都会运行N + 1次。 In the example below, console.log runs 5 times on my quad core. 在下面的示例中,console.log在我的四核上运行了5次。

Can someone explain if this is an implementation issue or cluster config issue? 有人可以解释这是实施问题还是群集配置问题? Is there any way to limit the scope of fork() without having to import a module ( as in this solution https://github.com/mochajs/mocha/issues/826 )? 有没有办法限制fork()的范围而不必导入模块(如在这个解决方案https://github.com/mochajs/mocha/issues/826 )?

/// myModule.js ////////////////////////////////////

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

var startCluster = function(){

  if (cluster.isMaster) {
    // CREATE A CLUSTER OF FORKED WORKERS, ONE PER CPU
    //master does not listen to UDP messages.

    for (var i = 0; i < numCPUs; i++) {

      var worker = cluster.fork();
    }


  } else {
    // Worker processes have an http server.
    http.Server(function (req, res){
      res.writeHead(200);
      res.end('hello world\n');
    }).listen(8000);

  }

  return
}

module.exports = startCluster;

/////////////////////////////////////////////////

//// test.js ////////////////////////////////////

var startCluster = require('./myModule.js')

startCluster()

console.log('hello');

////////////////////////////////////////////////////////

So I'll venture an answer. 所以我会冒险回答。 Looking closer at the node docs there is a cluster.setupMaster which can override defaults. 仔细观察节点文档,有一个cluster.setupMaster可以覆盖默认值。 The default on a cluster.fork() is to execute the current script, with "file path to worker file. (Default=process.argv[1])" https://nodejs.org/docs/latest/api/cluster.html#cluster_cluster_settings cluster.fork()的默认设置是执行当前脚本,“工作文件的文件路径。(默认= process.argv [1])” https://nodejs.org/docs/latest/api/cluster的.html#cluster_cluster_settings

So if another module is importing a script with a cluster.fork() call, it will still use the path of process.argv[1], which may not be the path you expect, and have unintended consequences. 因此,如果另一个模块正在使用cluster.fork()调用导入脚本,它仍将使用process.argv [1]的路径,该路径可能不是您期望的路径,并且会产生意想不到的后果。

So we shouldn't initialize the cluster master and worker in the same file as the official docs suggest. 因此,我们不应该在官方文档建议的同一文件中初始化集群主服务器和工作服务器。 It would be prudent to separate the worker into a new file and override the default settings. 将工作者分成新文件并覆盖默认设置是明智的。 (Also for safety you can add the directory path with __dirname ). (为了安全起见,您可以使用__dirname添加目录路径)。

cluster.setupMaster({ exec: __dirname + '/worker.js',});

So here would be the corrected implementation: 所以这将是更正的实现:

/// myModule.js ////////////////////////////////////

var cluster = require('cluster');
var numCPUs = require('os').cpus().length;

var startCluster = function(){
  cluster.setupMaster({ 
    exec: __dirname + '/worker.js'
  });

  if (cluster.isMaster) {
    // CREATE A CLUSTER OF FORKED WORKERS, ONE PER CPU    
    for (var i = 0; i < numCPUs; i++) {
       var worker = cluster.fork();
    }

  } 
  return
}

module.exports = startCluster;

/////////////////////////////////////////////////

//// worker.js ////////////////////////////////////
var http = require('http');

// All worker processes have an http server.
http.Server(function (req, res){
res.writeHead(200);
res.end('hello world\n');
}).listen(8000);

////////////////////////////////////////////////////////

//// test.js ////////////////////////////////////

var startCluster = require('./myModule.js')

startCluster()

console.log('hello');

////////////////////////////////////////////////////////

You should only see 'hello' once instead of 1 * Number of CPUs 您应该只看到'hello'一次而不是1 * CPU数量

You need to have the "isMaster" stuff at the top of your code, not inside the function. 你需要在代码的顶部放置“isMaster”,而不是在函数内部。 The worker will run from the top of the module ( it's not like a C++ fork, where the worker starts at the fork() point ). worker将从模块的顶部运行(它不像C ++ fork,worker在fork()点开始)。

I assume that you want the startCluster = require('./cluster-bug.js') to eval only once? 我假设你想要startCluster = require('。/ cluster-bug.js')只进行一次eval? Well, thats because your whole script runs clustered. 好吧,那就是因为你的整个脚本都是集群的。 What you do specify inside startCluster is only to make it vary between master and slave clusters. 你在startCluster中指定的只是让它在主集群和从集群之间变化。 Cluster spawns fork of file in which it is initialised. 群集生成初始化文件的fork。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM