Should Node.js be used for intensive processing?

Original: 2012-11-12 20:59:35 7 1 javascript/ node.js/ mongodb/ webserver/ scalability

Let's say I'm building a 3-tier web site, with MongoDB on the back end and some really lightweight JavaScript in the browser (let's say just validation on forms, maybe a couple of fancy controls which fire off some AJAX requests).

I need to choose a technology for the 'middle' tier (we could segment this into sub-tiers but that detail isn't the focus here, just the overall technology choice), where I want to crunch some raw data coming out of the DB, and render this into some HTML which I push to the browser. A fairly typical thin-client web architecture.

My safe choice would be to just implement this middle tier in Java, using some libraries like Jongo to talk to MongoDB and maybe Jackson to marshal/unmarshal JSON to talk to my fancy controls when they make AJAX requests. And some Java templating framework for rendering my HTML on the server.

However, I'm really intrigued by the idea of throwing all this out the window and using Node.js for this middle tier, for the following reasons:

I like JavaScript (the good parts), and let's say for this application's business logic it would be more expressive than Java.
It's JavaScript everywhere. No need to switch between languages, or indeed between the OO and functional paradigms, when working anywhere on the stack. There's no translation plumbing between the tiers; JSON is supported natively everywhere.
I can reuse validation logic on the client and server.
If in the future I decide to do the HTML rendering client-side in the browser, I can reuse the existing templates with something like Backbone with a pretty minimal refactoring / retesting effort.
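
To make the validation-reuse point concrete, here is a minimal sketch of one validation function loaded by both tiers. The function name `validateSignup`, the field names, and the rules are hypothetical, chosen only for illustration; the export guard assumes CommonJS on the server and a plain script tag in the browser.

```javascript
// Hypothetical shared validation module (names and rules are illustrative only).
// The same file can be loaded by Node and by the browser.
function validateSignup(form) {
  const errors = {};
  // Deliberately simple email check; real applications may want stricter rules.
  if (typeof form.email !== 'string' || !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(form.email)) {
    errors.email = 'Please enter a valid email address.';
  }
  if (typeof form.password !== 'string' || form.password.length < 8) {
    errors.password = 'Password must be at least 8 characters.';
  }
  // An empty object means the form passed validation.
  return errors;
}

// Export for Node (CommonJS); otherwise expose a global for browser scripts.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = { validateSignup };
} else if (typeof window !== 'undefined') {
  window.validateSignup = validateSignup;
}
```

The browser would call `validateSignup` before sending the AJAX request, and the server would call the same function again on the received JSON, since client-side checks alone cannot be trusted.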

If you're at this point and like Node, all the above will seem obvious. So I should choose Node, right?

BUT... this is where it falls down for me: as we all know, Node is based around a single-threaded async I/O web server model. This is great for my scalability and performance in terms of servicing requests for data, but what about my business logic? What about my template rendering? Won't this stuff cause a huge bottleneck for all requests on the single thread?
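
The concern can be seen in a few lines. This is only a sketch: the naive `fib` function is a hypothetical stand-in for CPU-heavy business logic or template rendering, and the timer stands in for another request's callback waiting on the same thread.

```javascript
// CPU-bound stand-in for "business logic": naive recursive Fibonacci.
function fib(n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// Records the order in which things happen on the single JavaScript thread.
const events = [];

// A timer that is due immediately, standing in for another request's callback.
setTimeout(() => events.push('timer callback ran'), 0);

// Synchronous CPU work: the event loop cannot run the timer until this returns.
const result = fib(25);
events.push(`cpu work finished (fib(25) = ${result})`);

// At this point `events` holds only the CPU entry; the timer callback runs
// later, once the current synchronous code has returned to the event loop.
```

However long the synchronous work takes, every other callback in that process waits for it, which is exactly the bottleneck the question describes.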

Two obvious solutions come to mind, but neither of them sits right:

Keep the 'blocking' business logic in there and just use a cluster of Node instances and a load balancer, to service requests in true parallel. Ok great, so why isn't Node just multi-threaded in the first place? Or was this always the idea, to Keep It Simple Stupid and avoid the possibility of multi-threaded complexity in the base case, making the programmer do the extra setup work on top of this if multi-core processing power is desired?
Keep a single Node instance, and keep it non-blocking by just calling out to some Java implementation of my business logic running on some other, multi-threaded, app server. Ok, this option completely nullifies every pro I listed of using Node (in fact it adds complexity over just using Java), other than the possible gains in performance and scalability for CRUD requests to the DB.
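
For the first option, Node ships a `cluster` module that forks several worker processes (not threads) sharing one listening port. Below is a minimal sketch under stated assumptions: `workerCount` and `startCluster` are hypothetical helper names, the port is whatever you pass in, and the `isPrimary`/`isMaster` fallback is there because the property was renamed in newer Node releases.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper: one worker per CPU, optionally capped, never fewer than one.
function workerCount(cpuCount, cap) {
  const n = Math.max(1, cpuCount);
  return cap ? Math.min(n, cap) : n;
}

// Hypothetical entry point. The bootstrap file must call this unconditionally,
// because cluster.fork() re-runs that same file in each worker process.
function startCluster(port) {
  // Newer Node exposes isPrimary; older versions only have isMaster.
  const isPrimary = cluster.isPrimary ?? cluster.isMaster;
  if (isPrimary) {
    const count = workerCount(os.cpus().length);
    for (let i = 0; i < count; i++) {
      cluster.fork();
    }
  } else {
    // Each worker is a separate process with its own event loop, so a
    // CPU-heavy request stalls only the worker that is handling it.
    http.createServer((req, res) => {
      res.end(`handled by pid ${process.pid}\n`);
    }).listen(port);
  }
}
```

A bootstrap file would then just call `startCluster(3000)`. This gives parallelism up to the number of CPUs, but each individual worker is still single-threaded.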

Which leads me finally to the point of my question: am I missing some huge important piece of the Node puzzle, have I just got my facts completely wrong, or is Node just unsuitable for crunching business logic on the server? Put another way, is Node just useful for sitting over a database and servicing many CRUD requests in a more performant and scalable way than some other implementation which blocks on I/O? And you have to do all your business logic in some tier below, or even client-side, to maintain any reasonable levels of performance and scalability?

Considering all the buzz over Node, I'd rather hoped it brought more to the table than this. I'd love to be convinced otherwise!

1 answer

On any given system you have N CPUs available (1-64, or whatever N happens to be). In any CPU-intensive application, you're going to be stuck with a throughput of N CPUs. There's no magical way to fix that by adding more than N threads/processes/whatever. Either your code has to be more efficient, or you need more CPUs. More threads won't help.

One of the little-appreciated facts about multiple-CPU performance is that if you need to run N+1 CPU-intensive operations at the same time, your throughput per CPU goes down quite a bit. A CPU-intensive process tends to hang on to that CPU for a very long time before giving it up, starving the other tasks pretty badly. In the majority of cases, it is blocking I/O and the concomitant task-switching that makes modern OS multitasking work as well as it does. If more of our everyday common tasks were CPU-bound, we would discover we needed a lot more CPUs in our machines than we do currently.

The nice thing that Node.js brings to the server party efficiency-wise is a thorough use of each thread. Ideally, you end up with less task switching. This isn't a huge win, but having N threads handling N*C connections asynchronously is going to have a performance advantage over N*C blocking threads running on the same number of CPUs. But the bottom line on CPUs remains the same: if you have more than N CPUs' worth of actual CPU work to be done, you're going to feel some pain.

The last time I looked at the Node.js API there was a way to launch a server with one listener plus one worker thread per CPU. If you can do that, I would be inclined to go with Node.js provided a few caveats are met:

The JavaScript-everywhere approach buys you some simplicity. For something complicated, I would be concerned about the asynchronous programming style making things harder rather than easier.
The template-processing and other CPU-intensive tasks aren't appreciably slower in Node.js than your other language/platform choices.
The database drivers are reliable.
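
The second caveat is something you can measure rather than guess. The following is a rough sketch, not a rigorous benchmark: `renderList` is a hypothetical stand-in for a real template engine, and `timeMs` is a hypothetical helper that averages wall-clock time over a number of iterations using `process.hrtime.bigint()`.

```javascript
// Escape the characters that matter for text placed inside HTML elements.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical minimal "template": renders one list item per user.
function renderList(users) {
  const rows = users.map((u) => `<li>${escapeHtml(u.name)} (${u.age})</li>`);
  return `<ul>${rows.join('')}</ul>`;
}

// Hypothetical timing helper: average milliseconds per call of fn.
function timeMs(fn, iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    fn();
  }
  const elapsedNs = process.hrtime.bigint() - start;
  return Number(elapsedNs) / 1e6 / iterations;
}
```

Running something like `timeMs(() => renderList(sampleUsers), 1000)` against your real templates and data, and doing the equivalent on the Java side, would give a first indication of whether the difference is appreciable for your workload.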