How can I speed up a Ruby application?

Question

I am building a data-intensive web application and trying to optimize it. I've heard of forking and threading, but I have no idea whether they apply to what I am trying to do or, if they do, how to implement them. My code looks like this:

  def search
    @amazon_data = Hash.from_xml(item.retrieve_amazon(params[:sku]))
    unless @amazon_data['results'] == nil
      @amazon_data['results']['item'].size.times do |i|
        @all_books << { :vendor => 'Amazon.com',
                        :price => @amazon_data['results']['item'][i]['price'].to_f,
                        :shipping => @amazon_data['results']['item'][i]['ship'].to_f,
                        :condition => @amazon_data['results']['item'][i]['condition'],
                        :total => @amazon_data['results']['item'][i]['price'].to_f + @amazon_data['results']['item'][i]['ship'].to_f,
                        :availability => 'In Stock',
                        :link_text => 'Go to Amazon.com',
                        :link_url => "http://www.amazon.com/gp/offer-listing/#{params[:isbn]}"
        }
      end
    end
    @ebay_data = Hash.from_xml(Book.retrieve_ebay(params[:sku]))
    unless @ebay_data['results'] == nil
      @ebay_data['results']['item'].size.times do |i|
        @all_books << { :vendor => 'eBay',
                        :price => @ebay_data['results']['item'][i]['price'].to_f,
                        :shipping => @ebay_data['results']['item'][i]['ship'].to_f,
                        :condition => 'Used',
                        :total => @ebay_data['results']['item'][i]['price'].to_f + @ebay_data['results']['item'][i]['ship'].to_f,
                        :availability => 'In Stock',
                        :link_text => 'Go to eBay',
                        :link_url => "http://www.amazon.com/gp/offer-listing/#{params[:sku]}"
        }
      end
    end
  end

So basically I have two actions that retrieve data from eBay and Amazon and parse it here. How might I make both of these actions run at once? Do fork or thread have anything to do with what I am trying to accomplish?

Running each request in its own thread (code below) cuts the API time in half, but I don't know how to return the results. The subsequent view loads before the API results are returned. It is returning data, however. When I put

puts @all_books

inside the thread, the results are displayed in the console. Outside the thread, however, no results are returned.

def search
  Thread.new do
    @amazon_data = Hash.from_xml(item.retrieve_amazon(params[:sku]))
    unless @amazon_data['results'] == nil
      @amazon_data['results']['item'].size.times do |i|
        @all_books << { :vendor => 'Amazon.com',
                        :price => @amazon_data['results']['item'][i]['price'].to_f,
                        :shipping => @amazon_data['results']['item'][i]['ship'].to_f,
                        :condition => @amazon_data['results']['item'][i]['condition'],
                        :total => @amazon_data['results']['item'][i]['price'].to_f + @amazon_data['results']['item'][i]['ship'].to_f,
                        :availability => 'In Stock',
                        :link_text => 'Go to Amazon.com',
                        :link_url => "http://www.amazon.com/gp/offer-listing/#{params[:isbn]}"
        }
      end
    end
  end
  Thread.new do
    @ebay_data = Hash.from_xml(Book.retrieve_ebay(params[:sku]))
    unless @ebay_data['results'] == nil
      @ebay_data['results']['item'].size.times do |i|
        @all_books << { :vendor => 'eBay',
                        :price => @ebay_data['results']['item'][i]['price'].to_f,
                        :shipping => @ebay_data['results']['item'][i]['ship'].to_f,
                        :condition => 'Used',
                        :total => @ebay_data['results']['item'][i]['price'].to_f + @ebay_data['results']['item'][i]['ship'].to_f,
                        :availability => 'In Stock',
                        :link_text => 'Go to eBay',
                        :link_url => "http://www.amazon.com/gp/offer-listing/#{params[:sku]}"
        }
      end
    end
  end
end

Am I on the right track? How can I return the results from within the thread? Is the variable only accessible within the thread, or is the problem that the program moves on before the results are returned?

Unfortunately, the application requires real-time user input to query the APIs. The returned data needs to be fresh, since it concerns product pricing in marketplaces. For instance, a user enters a SKU, and with that information the program makes a request to the applicable sites (Amazon and eBay in this case). Currently it makes the request to Amazon, parses the data, and formats it, then moves on to eBay, parses the data, and formats that. Then the formatted data is displayed in the view.

My thought was that if I could make those API calls at the same time (on different threads?), it would save time on the web-serving end, since all that would remain is to parse the returned data and format it correctly (which I might also be able to speed up).

Answer 1

Yeah, I still think you'd be better off with a job scheduler in this case. The absolute fastest that an action like this can run is the time of the slower of the two API requests, and you have no guarantees about network latency, load on the remote API, etc. On the other hand, you will have to implement some JavaScript code that periodically polls to detect job completion and inform the user of the results.

Also, thread behavior in Ruby 1.8 can be kinda funky at times, especially at scale, so beware.
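If you do stay with threads, the view in the question renders early because nothing waits for the threads to finish. Here is a minimal sketch of one way to wait, assuming each API call can be wrapped in a method that returns its formatted offers. `fetch_amazon` and `fetch_ebay` are hypothetical stand-ins (not the asker's methods), with `sleep` simulating network latency. Each thread returns its own array, and `Thread#value` blocks until that thread is done:

```ruby
# Hypothetical stand-ins for the two API calls in the question; each
# returns an array of already-formatted offer hashes.
def fetch_amazon(sku)
  sleep 0.1 # simulate network latency
  [{ :vendor => 'Amazon.com', :sku => sku, :price => 10.0 }]
end

def fetch_ebay(sku)
  sleep 0.1 # simulate network latency
  [{ :vendor => 'eBay', :sku => sku, :price => 8.5 }]
end

def search(sku)
  # Start both requests; each thread builds its own local result
  # instead of appending to a shared array.
  threads = [
    Thread.new { fetch_amazon(sku) },
    Thread.new { fetch_ebay(sku) }
  ]
  # Thread#value joins the thread and returns the block's result, so
  # this method cannot return before both requests have finished.
  threads.map { |t| t.value }.flatten
end

all_books = search('0596516177')
puts all_books.map { |b| b[:vendor] }.inspect
```

Returning a value from each thread also avoids two threads appending to the same `@all_books` array at once. The total wait is still bounded by the slower of the two requests.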

Answer 2

It's hard to say without more info, but my suspicion is that most of the time is spent waiting for the API responses.

Try a different approach, where requesting and processing the API response is handled in a separate process from the web-serving process. The front-end code will likely have to poll periodically for results and inject them into the page. But the win is that the whole request doesn't get backed up waiting for Amazon and eBay to do their thang.

There are several plugins that can help; delayed_job is a good place to start.
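To make the shape of that approach concrete, here is a rough sketch of the enqueue-then-poll pattern in plain Ruby. It is not delayed_job's API: `PriceJobs` is a hypothetical in-memory stand-in for the job table, and a thread stands in for the separate worker process, purely so the example runs on its own.

```ruby
require 'securerandom'

# Hypothetical in-memory job table. A real setup would persist jobs and
# run them in a separate worker process; this is only an illustration.
class PriceJobs
  def initialize
    @jobs = {}
    @lock = Mutex.new
  end

  # Called by the web action: records the job, starts the work, and
  # returns a job id immediately without waiting for the APIs.
  def enqueue(sku, &work)
    id = SecureRandom.hex(8)
    @lock.synchronize { @jobs[id] = { :status => :pending, :result => nil } }
    Thread.new do
      begin
        result = work.call(sku)
        @lock.synchronize { @jobs[id] = { :status => :done, :result => result } }
      rescue StandardError => e
        @lock.synchronize { @jobs[id] = { :status => :failed, :result => e.message } }
      end
    end
    id
  end

  # Called by the polling endpoint that the front-end JavaScript hits.
  # Returns nil for an unknown job id.
  def status(id)
    @lock.synchronize do
      job = @jobs[id]
      job && job.dup
    end
  end
end

jobs = PriceJobs.new
job_id = jobs.enqueue('0596516177') do |sku|
  sleep 0.1 # simulate the slow API round trip
  [{ :vendor => 'eBay', :sku => sku, :price => 8.5 }]
end

# The browser would poll on a timer; here a loop plays that role.
sleep 0.05 while jobs.status(job_id)[:status] == :pending
puts jobs.status(job_id)[:result].inspect
```

The point of the design is that `enqueue` returns right away, so the page can render while the lookup is still running, and the polling endpoint only ever does a cheap status read.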

Answer 3

You might also look into EventMachine*, which allows you to execute your outbound network calls in a non-blocking way. If you could return the first result to the user, then get the final result over Ajax, the user interaction would feel faster.

This is similar to what Kayak.com does with its real-time flight search.

You could also consider caching results, returning those to the user quickly, then populating updated results (that you loaded asynchronously) via Ajax. (You'd have to figure out the right UI for that; maybe just put "popular" results above the fold and the latest updates below the fold, or something.)
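As a rough illustration of the caching idea, a small time-limited cache keyed by SKU might look like the sketch below. `PriceCache` and the 60-second lifetime are assumptions for the example; how stale a price may be is something only the application can decide.

```ruby
# A tiny time-limited cache (hypothetical; not from any library).
class PriceCache
  def initialize(ttl_seconds = 60)
    @ttl = ttl_seconds
    @entries = {}
    @lock = Mutex.new
  end

  # Returns the cached offers for the SKU, or nil if absent or expired.
  def read(sku, now = Time.now)
    @lock.synchronize do
      entry = @entries[sku]
      entry[:offers] if entry && now - entry[:stored_at] < @ttl
    end
  end

  # Stores fresh offers for the SKU and returns them.
  def write(sku, offers, now = Time.now)
    @lock.synchronize { @entries[sku] = { :offers => offers, :stored_at => now } }
    offers
  end
end

cache = PriceCache.new(60)
cache.write('0596516177', [{ :vendor => 'Amazon.com', :price => 10.0 }])

# On page load, show whatever is cached right away; the fresh lookup
# would then run asynchronously and call write again when it finishes.
cached = cache.read('0596516177')
puts cached.inspect
```

A miss (`nil`) simply means the page has nothing to show until the asynchronous lookup comes back.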

* EventMachine is complicated.


Answer 1: score 1, accepted, 2009-08-20 20:54:16

Answer 2: score 0, 2009-08-20 05:56:48

Answer 3: score 0, 2011-12-14 22:13:56
