
Meteor Node Process CPU Usage Nears 100%

Original post: 2013-10-19 15:13:46 · Tags: node.js, meteor

I'm having trouble with my Meteor app when it gets to its peak amount of traffic (peak for this is nothing, 1k visits, maybe 2,500 pageviews in a day). CPU usage spikes and never recovers, so I've taken to using Nodetime to monitor usage and I've been reloading the process (forever restart) to get things back to normal.

I'm fairly new to profiling, so I'm at a loss for where to start in finding the underlying cause. I'm fairly certain it has to do with my app's server code, but the profiling seems to point to the Fibers module as a "hotspot", which I understand aids in making my server code synchronous.

Below is a snippet from the profiling results. I hope someone can guide me in the right direction in troubleshooting this!

[Image: snippet of the profiling results]

2 Answers

While I don't have a specific answer to your question, I have experience dealing with CPU issues for our production meteor app, so I can give you a list of things to investigate.

Upgrade to the latest version of meteor and the appropriate node version (see the changelog). As of this writing that's meteor 0.8.2 and node 0.10.28.
Read this and this article. The latter makes a great point that you really should always try to delay activation of subscriptions until you need them. In particular you may not need to publish anything for users who are not logged in. In my experience, meteor CPU problems have everything to do with subscriptions.
Be careful with observe and observeChanges. These are expensive and are easy to abuse. In particular:
- Make sure you are calling stop() on your handles when they are no longer needed (consider using a package like publish-with-relations so this is done for you).
- Fetch only the collections and fields that you absolutely need. Observe works by continually diffing objects (requires lots of CPU). The fewer and smaller objects you have, the less there is to compute.
~~Consider using smart-collections before it is retired.~~ Use oplog tailing - this can make for a night and day difference in performance and CPU usage in your app.
Consider making some things not reactive (also mentioned in the articles above). For us that was a big win. We had one extremely expensive join that was used on two frequently accessed pages on the site. When it got to the point where the CPU was pegged at 100% about every 30 minutes, I gave up on reactivity for that element and just did the join on the server and shipped the data to the client via a method call. I also created a server-side expiring cache for these results and stored them by user (special thanks to Matt DeBergalis for this suggestion).
Do a preventative nightly restart. I have a cron job that tells forever to restart our app once a day in the middle of the night. That brings the CPU down from ~10% to 1%. This seems like black magic, but the fact that the CPU usage changes after a restart leads me to believe this is a good idea.
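
To make the "method call plus expiring cache" idea above a bit more concrete, here is a minimal sketch of what it could look like. This is not our actual code: createExpiringCache, computeExpensiveJoin, the method name expensiveJoin and the 5-minute TTL are all made-up names and values for illustration. Only Meteor.methods and this.userId are real Meteor APIs.

```javascript
// Minimal per-user expiring cache. A sketch only; every name here is hypothetical.
function createExpiringCache(ttlMs, now) {
  now = now || Date.now;              // injectable clock, useful for testing
  var entries = Object.create(null);  // userId -> { value, expiresAt }

  return {
    // Return the cached value for userId, or compute and store a fresh one.
    get: function (userId, compute) {
      var entry = entries[userId];
      if (entry && entry.expiresAt > now()) {
        return entry.value;
      }
      var value = compute(userId);
      entries[userId] = { value: value, expiresAt: now() + ttlMs };
      return value;
    },
    // Drop a user's entry, e.g. after they change the underlying data.
    invalidate: function (userId) {
      delete entries[userId];
    }
  };
}

// Hypothetical stand-in for the expensive server-side join.
function computeExpensiveJoin(userId) {
  return { userId: userId, rows: [] };
}

var joinCache = createExpiringCache(5 * 60 * 1000); // 5-minute TTL (assumed value)

// Registered only inside Meteor, so this file also loads in plain Node.
if (typeof Meteor !== 'undefined') {
  Meteor.methods({
    // The client calls Meteor.call('expensiveJoin', callback) instead of subscribing.
    expensiveJoin: function () {
      return joinCache.get(this.userId, computeExpensiveJoin);
    }
  });
}
```

Note that this sketch only replaces an entry when the same user asks again after it has expired. Stale entries for users who never come back stay in memory, so a real version would probably want a periodic sweep or a size limit.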

Updated thoughts (1/13/14)

We migrated to oplog tailing as soon as it was available (meteor 0.7) and that made a big difference. Note that in order to get access to the oplog, you'll probably need to either host your own db or run a dedicated instance on the hosting provider of your choice. I'd also recommend adding the facts package to actually tell if it's working.
There was a memory leak discovered in publish-with-relations, and as of this writing the atmosphere version (v0.1.5) hasn't been bumped to reflect these changes. If you are using it in production, I strongly recommend checking out the HEAD version and running it locally.
We stopped doing nightly restarts a couple of weeks ago. So far everything has been fine (fingers crossed).
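
If you would rather not depend on publish-with-relations at all, the earlier advice about calling stop() on observe handles can be followed by hand. The sketch below shows roughly what that looks like, under some assumptions: publishCursorWithCleanup, the publication name myOpenTasks and the Tasks collection (with owner, done, title and due fields) are hypothetical. The calls it relies on (Meteor.publish, cursor.observeChanges, and the publish handler's added, changed, removed, onStop and ready) are standard Meteor APIs. When a publish function simply returns a cursor, Meteor handles this cleanup itself, so the manual version only matters when you call observe or observeChanges yourself.

```javascript
// Forward a cursor's changes to one subscription and make sure the observer
// is stopped when the subscription ends. `sub` is the publish function's `this`.
function publishCursorWithCleanup(sub, collectionName, cursor) {
  var handle = cursor.observeChanges({
    added: function (id, fields) { sub.added(collectionName, id, fields); },
    changed: function (id, fields) { sub.changed(collectionName, id, fields); },
    removed: function (id) { sub.removed(collectionName, id); }
  });

  // Without this, the observer keeps running after the client has gone away.
  sub.onStop(function () { handle.stop(); });

  // Tell the client the initial set of documents has been sent.
  sub.ready();
}

// Registered only inside Meteor, so this file also loads in plain Node.
if (typeof Meteor !== 'undefined') {
  // `Tasks` is a hypothetical collection assumed to be defined elsewhere.
  Meteor.publish('myOpenTasks', function () {
    // Publish nothing for users who are not logged in.
    if (!this.userId) { return this.ready(); }

    // Restrict both the documents and the fields to what the page needs.
    var cursor = Tasks.find(
      { owner: this.userId, done: false },
      { fields: { title: 1, due: 1 } }
    );
    publishCursorWithCleanup(this, 'tasks', cursor);
  });
}
```

The same publish function also illustrates two of the other points from the list: skipping anonymous users and limiting the published fields.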

Updated thoughts (7/2/14)

A few months ago we switched over to using an Elastic Deployment on mongohq. It's affordable, the performance has been great, and they even have a blog post which tells you how to enable oplog tailing.
I'd strongly recommend checking out kadira to help diagnose performance issues in your app. Also check out the academy articles which have a number of good tips in them.

I'm also having this problem. Actually there is an issue with 0.6.6.1. I now run meteor --release 0.6.6 and the CPU is back to normal.