
Number of Web Workers Limit

PROBLEM

I've discovered that there is a limit on the number of Web Workers that can be spawned by a browser.

Example

main HTML / JavaScript

<script type="text/javascript">
$(document).ready(function(){
    var workers = new Array();
    var worker_index = 0;
    for (var i=0; i < 25; i++) {
        workers[worker_index] = new Worker('test.worker.js');
        workers[worker_index].onmessage = function(event) {
            $("#debug").append('worker.onmessage i = ' + event.data + "<br>");
        };
        workers[worker_index].postMessage(i); // start the worker.      

        worker_index++;
    }   
});
</script>
</head>
<body>
<div id="debug">
</div>

test.worker.js

self.onmessage = function(event) {
    var i = event.data; 

    self.postMessage(i);
};

This will generate only 20 output lines in the container when using Firefox (version 14.0.1, Windows 7).

QUESTION

Is there a way around this? The only two ideas I can think of are:

1) Daisy chaining the web workers, i.e., making each web worker spawn the next one

Example:

<script type="text/javascript">
$(document).ready(function(){
    createWorker(0);
});

function createWorker(i) {

    var worker = new Worker('test.worker.js');
    worker.onmessage = function(event) {
        var index = event.data;

        $("#debug").append('worker.onmessage i = ' + index + "<br>");

        if ( index < 25) {
            index++;
            createWorker(index);
        } 
    };
    worker.postMessage(i); // start the worker.
}
</script>
</head>
<body>
<div id="debug"></div>

2) Limit the number of web workers to a finite number and modify my code to work with that limit (i.e., share the work load across a finite number of web workers) - something like this: http://www.smartjava.org/content/html5-easily-parallelize-jobs-using-web-workers-and-threadpool

Unfortunately #1 doesn't seem to work (only a finite number of web workers will get spawned on a page load). Are there any other solutions I should consider?

Old question, let's revive it! readies epinephrine

I've been looking into using Web Workers to isolate 3rd party plugins, since web workers can't access the host page. I'll help you out with your methods, which I'm sure you've solved by now, but this is for teh internetz. Then I'll give some relevant information from my research.

Disclaimer: In the examples where I used your code, I've modified and cleaned it up to provide full source code without jQuery so that you and others can run it easily. I've also added a timer which alerts the time in ms taken to execute the code.

In all examples, we reference the following genericWorker.js file.

genericWorker.js

self.onmessage = function(event) {
    self.postMessage(event.data);
};

Method 1 (Linear Execution)

Your first method is nearly working. The reason it still fails is that you aren't deleting any workers once you finish with them. This means the same result (crashing) will happen, just slower. All you need to do to fix it is add worker.terminate(); before creating a new worker, which removes the old one from memory. Note that this will cause the application to run much slower, as each worker must be created, run, and destroyed before the next can run.

Linear.html

<!DOCTYPE html>
<html>
<head>
    <title>Linear</title>
</head>
<body>
    <pre id="debug"></pre>
    <script type="text/javascript">
        var debug = document.getElementById('debug');
        var totalWorkers = 250;
        var index = 0;
        var start = (new Date).getTime();

        function createWorker() {
            var worker = new Worker('genericWorker.js');
            worker.onmessage = function(event) {
                debug.appendChild(document.createTextNode('worker.onmessage i = ' + event.data + '\n'));
                worker.terminate();
                if (index < totalWorkers) createWorker(index);
                else alert((new Date).getTime() - start);
            };
            worker.postMessage(index++); // start the worker.
        }

        createWorker();
    </script>
</body>
</html>

Method 2 (Thread Pool)

Using a thread pool should greatly increase running speed. Instead of using some library with complex lingo, let's simplify it. All a thread pool means is having a set number of workers running simultaneously. We can actually just modify a few lines of code from the linear example to get a multi-threaded example. The code below will find how many cores you have (if your browser supports navigator.hardwareConcurrency), or default to 4. I found that this code ran about 6x faster than the linear version on my machine with 8 cores.

ThreadPool.html

<!DOCTYPE html>
<html>
<head>
    <title>Thread Pool</title>
</head>
<body>
    <pre id="debug"></pre>
    <script type="text/javascript">
        var debug = document.getElementById('debug');
        var maxWorkers = navigator.hardwareConcurrency || 4;
        var totalWorkers = 250;
        var index = 0;
        var start = (new Date).getTime();

        function createWorker() {
            var worker = new Worker('genericWorker.js');
            worker.onmessage = function(event) {
                debug.appendChild(document.createTextNode('worker.onmessage i = ' + event.data + '\n'));
                worker.terminate();
                if (index < totalWorkers) createWorker();
                else if(--maxWorkers === 0) alert((new Date).getTime() - start);
            };
            worker.postMessage(index++); // start the worker.
        }

        for(var i = 0; i < maxWorkers; i++) createWorker();
    </script>
</body>
</html>

Other Methods

Method 3 (Single worker, repeated task)

In your example, you're using the same worker script over and over again. I know you're simplifying a probably more complex use case, but some people viewing this will apply the method above when they could be using just one worker for all the tasks.

Essentially, we'll instantiate a worker, send data, wait for data, then repeat the send/wait steps until all data has been processed.

On my computer, this runs at about twice the speed of the thread pool. That actually surprised me. I thought the overhead from the thread pool would have caused it to be slower than just 1/2 the speed.

RepeatedWorker.html

<!DOCTYPE html>
<html>
<head>
    <title>Repeated Worker</title>
</head>
<body>
    <pre id="debug"></pre>
    <script type="text/javascript">
        var debug = document.getElementById('debug');
        var totalWorkers = 250;
        var index = 0;
        var start = (new Date).getTime();
        var worker = new Worker('genericWorker.js');

        function runWorker() {
            worker.onmessage = function(event) {
                debug.appendChild(document.createTextNode('worker.onmessage i = ' + event.data + '\n'));
                if (index < totalWorkers) runWorker();
                else {
                    alert((new Date).getTime() - start);
                    worker.terminate();
                }
            };
            worker.postMessage(index++); // start the worker.
        }

        runWorker();
    </script>
</body>
</html>

Method 4 (Repeated Worker w/ Thread Pool)

Now, what if we combine the previous method with the thread pool method? Theoretically, it should run quicker than the previous one. Interestingly, it runs at just about the same speed as the previous one on my machine.

Maybe it's the extra overhead of passing the worker reference each time runWorker is called. Maybe it's the extra workers being terminated during execution (only one worker won't be terminated before we get the time). Who knows. Finding this out is a job for another time.

RepeatedThreadPool.html

<!DOCTYPE html>
<html>
<head>
    <title>Repeated Thread Pool</title>
</head>
<body>
    <pre id="debug"></pre>
    <script type="text/javascript">
        var debug = document.getElementById('debug');
        var maxWorkers = navigator.hardwareConcurrency || 4;
        var totalWorkers = 250;
        var index = 0;
        var start = (new Date).getTime();

        function runWorker(worker) {
            worker.onmessage = function(event) {
                debug.appendChild(document.createTextNode('worker.onmessage i = ' + event.data + '\n'));
                if (index < totalWorkers) runWorker(worker);
                else {
                    if(--maxWorkers === 0) alert((new Date).getTime() - start);
                    worker.terminate();
                }
            };
            worker.postMessage(index++); // start the worker.
        }

        for(var i = 0; i < maxWorkers; i++) runWorker(new Worker('genericWorker.js'));
    </script>
</body>
</html>

Now for some real world shtuff

Remember how I said I was using workers to implement 3rd party plugins in my code? These plugins have a state to keep track of. I could start the plugins and hope the user doesn't load so many that the application crashes, or I could keep track of the plugin state within my main thread and send that state back to the plugin if the plugin needs to be reloaded. I like the second one better.

I had written out several more examples of stateful, stateless, and state-restore workers, but I'll spare you the agony and just do some brief explaining with some shorter snippets.

First off, a simple stateful worker looks like this:

StatefulWorker.js

var i = 0;

self.onmessage = function(e) {
    switch(e.data) {
        case 'increment':
            self.postMessage(++i);
            break;
        case 'decrement':
            self.postMessage(--i);
            break;
    }
};

It does some action based on the message it receives and holds data internally. This is great. It allows mah plugin devs to have full control over their plugins. The main app instantiates their plugin once, then sends it messages telling it to do some action.

The problem comes in when we want to load many plugins at once, because we run into the worker limit. We can't do that, so what can we do?

Let's think about a few solutions.

Solution 1 (Stateless)

Let's make these plugins stateless. Essentially, every time we want the plugin to do something, our application instantiates the plugin and then sends it data based on its old state.

data sent

{
    action: 'increment',
    value: 7
}

StatelessWorker.js

self.onmessage = function(e) {
    switch(e.data.action) {
        case 'increment':
            e.data.value++;
            break;
        case 'decrement':
            e.data.value--;
            break;
    }
    self.postMessage({
        value: e.data.value,
        i: e.data.i
    });
};

This could work, but if we're dealing with a good amount of data this will start to seem like a less-than-perfect solution, since the whole state has to be sent to and from the worker on every call. Another similar solution could be to have several smaller workers for each plugin and send only a small amount of data to and from each, but I'm uneasy with that too.
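For what it's worth, the main-thread side of the stateless approach could be sketched roughly like this. The callStateless helper and the makeWorker factory are names I made up for illustration; in a browser the factory would be something like function () { return new Worker('StatelessWorker.js'); }.

```javascript
// Sketch only: spin up a worker for a single action, hand it the old state,
// and throw the worker away as soon as it replies.
// `makeWorker` is a hypothetical factory that returns a Worker-like object.
function callStateless(makeWorker, action, value, done) {
    var worker = makeWorker();
    worker.onmessage = function (event) {
        worker.terminate();        // nothing is kept alive between calls
        done(event.data.value);    // the new state lives on the main thread
    };
    worker.postMessage({ action: action, value: value });
}
```

The caller keeps the returned value and passes it back in on the next call, so the number of live workers only ever equals the number of calls in flight.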

Solution 2 (State Restore)

What if we try to keep the worker in memory as long as possible, but make sure that if we do lose it, we can restore its state? We can use some sort of scheduler to see what plugins the user has been using (and maybe some fancy algorithms to guess what the user will use in the future) and keep those in memory.

The cool part about this is that we aren't looking at one worker per core anymore. Since a worker will be idle for most of the time it is loaded, we just need to worry about the memory it takes up. For a good number of workers (10 to 20 or so), this won't be substantial at all. We can keep the primary plugins loaded while the ones not used as often get switched out as needed. All the plugins will still need some sort of state restore.

Let's use the following worker and assume we send either 'increment', 'decrement', or an integer containing the state it's supposed to be at.

StateRestoreWorker.js

var i = 0;

self.onmessage = function(e) {
    switch(e.data) {
        case 'increment':
            self.postMessage(++i);
            break;
        case 'decrement':
            self.postMessage(--i);
            break;
        default:
            i = e.data;
    }
};
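A rough sketch of the main-thread half of the state-restore idea, under my own assumptions: createPlugin, send, unload, and makeWorker are hypothetical names, and in a browser makeWorker would be something like function () { return new Worker('StateRestoreWorker.js'); }.

```javascript
// Sketch only: remember the last value the plugin worker reported, so that a
// terminated worker can be recreated later and put back where it left off.
// `makeWorker` is a hypothetical factory that returns a Worker-like object.
function createPlugin(makeWorker) {
    var worker = null;
    var state = 0;                     // last state reported by the worker

    function ensureWorker() {
        if (worker) return;
        worker = makeWorker();
        worker.onmessage = function (event) { state = event.data; };
        worker.postMessage(state);     // integer message: restore saved state
    }

    return {
        send: function (action) {      // 'increment' or 'decrement'
            ensureWorker();
            worker.postMessage(action);
        },
        unload: function () {          // free the worker, keep the state here
            if (worker) worker.terminate();
            worker = null;
        },
        state: function () { return state; }
    };
}
```

One caveat with this sketch: if unload() is called before the worker has replied to the last action, the saved state will be one step behind, so a real scheduler would want to wait for outstanding replies first.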

These are all pretty simple examples, but I hope I helped you understand some methods of using multiple workers efficiently! I'll most likely be writing a scheduler and optimizer for this stuff, but who knows when I'll get to that point.

Good luck, and happy coding!

My experience is that too many workers (> 100) decrease the performance. In my case FF became very slow and Chrome even crashed. I compared variants with different numbers of workers (1, 2, 4, 8, 16, 32). Each worker performed an encryption of a string. It turned out that 8 was the optimal number of workers, but that may differ depending on the problem the worker has to solve.

I built a small framework to abstract away the number of workers. Calls to the workers are created as tasks. If the maximum allowed number of workers are all busy, a new task is queued and executed later.

It turned out that it's very important to recycle the workers in such an approach. You should hold them in a pool when they are idle, and avoid calling new Worker(...) too often. Even if the workers are terminated by worker.terminate(), there seems to be a big difference in performance between creating/terminating workers and recycling them.
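This is not the framework mentioned above, just a minimal sketch of the same idea under my own assumptions: createPool, run, and makeWorker are hypothetical names, and in a browser makeWorker would be something like function () { return new Worker('genericWorker.js'); }.

```javascript
// Sketch only: a task queue where at most `size` workers are ever created,
// and idle workers are recycled instead of being terminated.
// `makeWorker` is a hypothetical factory that returns a Worker-like object.
function createPool(size, makeWorker) {
    var idle = [];       // workers waiting for a task
    var queue = [];      // tasks waiting for a worker
    var created = 0;     // number of workers constructed so far

    function next() {
        if (queue.length === 0) return;
        var worker = idle.pop();
        if (!worker) {
            if (created >= size) return;   // all busy: the task stays queued
            worker = makeWorker();
            created++;
        }
        var task = queue.shift();
        worker.onmessage = function (event) {
            idle.push(worker);             // recycle, don't terminate()
            task.done(event.data);
            next();                        // pick up queued work, if any
        };
        worker.postMessage(task.data);
    }

    return {
        run: function (data, done) {
            queue.push({ data: data, done: done });
            next();
        },
        created: function () { return created; }
    };
}
```

Usage would be along the lines of var pool = createPool(8, makeWorker); pool.run(someData, function (result) { ... }); with the pool size chosen by measuring, as described above.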

The way you're chaining your Workers in solution #1 prevents the garbage collector from cleaning up the Worker instances, because you still hold a reference to each of them in the scope of your onmessage callback function.

Give this code a try:

<script type="text/javascript">
var worker;
$(document).ready(function(){
    createWorker(0);
});
function createWorker(i) {
   worker = new Worker('test.worker.js');
   worker.onmessage = handleMessage;
   worker.postMessage(i); // start the worker.
}
function handleMessage(event) {
       var index = event.data;
       $("#debug").append('worker.onmessage i = ' + index + "<br>");

        if ( index < 25) {
            index++;
            createWorker(index);
        } 
    };
</script>
</head>
<body>
<div id="debug"></div>

Old question, but it comes up on a search, so... there is a configurable limit in Firefox. If you look in about:config (type it into FF's address bar) and search for 'worker', you will see several settings, including this one:

dom.workers.maxPerDomain

It is set to 20 by default. Double-click the line and change the setting. You will need to restart the browser.
