
Background job takes twice as long as the same operation within Rails

Original post: 2014-07-02 17:50:57 · 4 3 · tags: ruby-on-rails, delayed-job

In my Rails application, I have a long calculation that requires a lot of database access.

In short, the calculation takes 25 seconds.

When I run the same calculation in a background job (a single big worker), it takes twice as long (i.e. 50 seconds). I have tried several techniques for moving the work into the background, and none of them changed the result: Delayed Job, Sidekiq, and running the work inside my Rails process in a thread created for it all roughly double the run time.

This performance difference only exists in the Rails 'production' environment. It looks like Rails applies an optimisation that is not applied in my background job.

My technical environment is the following:

I am using Ruby 2.0 / Rails 4.
I am using Unicorn (but I have the same problem without it).
The job uses Rails.cache to store some partial computations.
I am using PostgreSQL.

Does anybody have a clue where this impact might come from?

3 answers

I'm assuming you're comparing the background job speed to the speed of running the operation during a web request? If so, you're likely benefiting from Rails's QueryCache, which caches db queries during a web request. Try disabling it as described here:

Disabling Rails SQL query caching globally

If that causes the web request version of the job to take as long as the background job, you've found your culprit. You can then enable the query cache in your background job to speed it up (if it makes sense for your application).
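If the query cache does turn out to be the difference, one way to enable it inside a job is ActiveRecord's block-scoped `ActiveRecord::Base.cache`. The following is only a minimal sketch: `with_query_cache`, `LongCalculationJob`, and `run_calculation` are hypothetical names that do not come from the question, and the helper simply runs the block when ActiveRecord is not loaded so the snippet can be tried outside Rails.

```ruby
# Runs the block with ActiveRecord's query cache enabled when ActiveRecord is
# loaded; otherwise it just runs the block. Returns the block's value.
def with_query_cache
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.cache { yield }
  else
    yield
  end
end

# Hypothetical job object; the real job class and its calculation are assumptions.
class LongCalculationJob
  def perform
    with_query_cache { run_calculation }
  end

  # Placeholder for the real database-heavy calculation.
  def run_calculation
    :done
  end
end

LongCalculationJob.new.perform
```

Note that the query cache only helps when the calculation repeats identical SQL queries, and it is cleared when the block writes to the database.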

A background job is not a tool for speeding things up. Its main purpose is 'fire and forget': it removes 25 seconds of synchronous calculation at the cost of somewhat more asynchronous calculation. That way you can tell the user that their request is being processed and return the result later.

You can gain speed from background jobs by splitting a big task into smaller ones and running them at the same time. In your case I think that is not possible, because the operations in your calculation depend on each other.
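For work that can be split into independent pieces, the idea can be sketched in plain Ruby with threads. This is an illustration only: `parallel_sum` is a hypothetical helper, and squaring numbers stands in for the real per-item work. On MRI, threads mainly overlap while waiting on I/O such as database queries, and in a Rails app each thread would need its own database connection from the pool.

```ruby
# Splits independent work items into slices, processes each slice in its own
# thread, and adds up the partial results.
def parallel_sum(items, slice_size)
  threads = items.each_slice(slice_size).map do |slice|
    Thread.new { slice.map { |item| yield(item) }.inject(0, :+) }
  end
  # Thread#value waits for the thread to finish and returns its result.
  threads.map(&:value).inject(0, :+)
end

total = parallel_sum((1..10).to_a, 3) { |n| n * n }
puts total # prints 385
```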

So if you want to speed up your calculation, you need to look into denormalizing your data structure: store some precomputed values for the big calculation at the moment the source data for it is updated. You will then calculate less when the user requests results and more when data is stored. That is a good place to use a background job: when you finish updating the data, create a background task that updates the caches. If a user request for the calculation arrives before this task has finished, you will still need to wait for the cache to be filled.

Update: I think I still need to answer your main question. Basically, the additional time in background task processing comes from the implementation. Because of the 'fire and forget' approach, nobody wants the background task scheduler to consume a large amount of processor time just monitoring for new jobs. I am not completely sure, but I think that if your calculation were twice as complex, the added time would still be the same 25 seconds.

My guess is that the extra time comes from your background worker having to load Rails and all of your application. My clue is that you said the difference was greatest with Rails in production mode. In production mode, subsequent calls to the app make use of the app and class cache.

How to check this hypothesis:

Change your background job to do the following:

1. print a log message before you initiate the worker
2. start the worker
3. run your calculation. As part of your calculation startup, print a log message
4. print another log message
5. run your calculation again
6. print another log message

Then compare the times of the two calculation runs.

Of course, the second run will also gain some time from database caching, code remaining resident in memory, etc. But if the second run is much, much faster, then the fact that it didn't have to restart Rails is the more significant factor.

Also, the time between the log messages from steps 1 and 3 will help you understand the startup time.
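The part of the experiment that runs inside the job (steps 3 to 6) can be sketched as follows. This is a rough illustration using Ruby's standard `benchmark` and `logger` libraries; `time_two_runs` is a hypothetical helper, and `sleep 0.1` stands in for the real calculation. The log message from step 1 would be written wherever the job is enqueued.

```ruby
require "benchmark"
require "logger"

# Runs the given block twice, logging before and after each run.
# Returns the two wall-clock durations in seconds as [first, second].
def time_two_runs(logger)
  logger.info("starting first run")
  first = Benchmark.realtime { yield }
  logger.info("first run finished in #{first.round(2)}s, starting second run")
  second = Benchmark.realtime { yield }
  logger.info("second run finished in #{second.round(2)}s")
  [first, second]
end

logger = Logger.new($stdout)
first, second = time_two_runs(logger) { sleep 0.1 } # placeholder calculation
logger.info("first: #{first.round(2)}s, second: #{second.round(2)}s")
```

The logger timestamps also show how much time passes between enqueueing and the first run starting, which is the startup cost the answer refers to.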

Fixes

Why wait? Most important: why do you need the results faster? E.g., tell your user that the result will be emailed to them after it is calculated. Or let your user see that the calculation is proceeding in the background, and show them the result later.

The key for any long-running calculation is to do it in the background and encourage the user not to wait for the result. They should be able to do something else until they get the result.

Start the calculation automatically. As soon as the user logs in, or after they do something interesting, start the calculation. That way, when (and if) the user asks for the calculation, the answer will either be already done or will soon be done.

Cache the result and bust the cache as needed. Similar to the above, start the calculation periodically and automatically. If the user changes some data, then restart the calculation by busting the cache. There are also ways to halt any ongoing calculation if data is changed during the calculation.
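The cache-and-bust idea can be sketched with a tiny in-memory store. `ResultCache` is a hypothetical stand-in written for this illustration; in a Rails app, `Rails.cache.fetch(key) { ... }` and `Rails.cache.delete(key)` would play the same roles.

```ruby
# Minimal in-memory cache: computes a value once per key until it is busted.
class ResultCache
  def initialize
    @store = {}
  end

  # Returns the cached value for key, or runs the block and stores its result.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end

  # Drops the cached value so the next fetch recomputes it.
  def bust(key)
    @store.delete(key)
  end
end

cache = ResultCache.new
runs = 0
calculation = -> { runs += 1; 42 } # placeholder for the long calculation

cache.fetch(:report) { calculation.call } # computed
cache.fetch(:report) { calculation.call } # served from the cache
cache.bust(:report)                       # user changed some data
cache.fetch(:report) { calculation.call } # recomputed
puts runs # prints 2
```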

Pre-calculate part of the calculation. Why are you taking 25 seconds or more for a DBMS calculation? It could be that you should change the calculation. Investigate adding indexes, summary tables, de-normalizing, splitting the calculation into smaller steps that can be pre-calculated, etc.