简体   繁体   English

Delayed_Job scraper在开发中工作,但在Heroku上没有

[英]Delayed_Job scraper works in development but not on Heroku

Here's the code for the scraper 这是刮刀的代码

class Scrape
  def perform
    url = "# a long url"

    agent = Mechanize.new
    agent.get(url)

    while(agent.page.link_with(:text => "Next Page \u00BB")) do
      agent.page.search(".content").each do |item|
        puts "."
        House.create!({
          # attributes...
        })
      end

      agent.page.link_with(:text => "Next Page \u00BB").click
    end
  end
end

On my local environment I can run it in the rails console just by typing 在我的本地环境中,我只需键入即可在rails控制台中运行它

Scrape.new.delay.perform # to queue the job
rake jobs:work

and it works perfectly. 它完美无缺。

However running the analogous (with a worker running instead of rake jobs:work) in the Heroku console doesn't seem to do anything. 然而,在Heroku控制台中运行类似的(运行工作而不是rake作业:工作)似乎没有做任何事情。 I tried logging some lines in the Heroku log and I can get the url variable to log (so the method is at least getting called) but the "." 我尝试在Heroku日志中记录一些行,我可以将url变量记录下来(因此该方法至少被调用)但是“。” which is there to show each time we run the while loop never appears and no Houses are created in the database. 每次我们运行while循环时都会显示,而且数据库中没有创建Houses。

Anyone any ideas what might be wrong? 任何想法有什么可能是错的?

Solved this problem myself, pretty obscure bug though. 我自己解决了这个问题,虽然相当模糊。 I was using ruby 1.9.2 in my local environment but I had the app deployed on a ruby 1.8.7 stack. 我在我的本地环境中使用ruby 1.9.2但是我将应用程序部署在ruby 1.8.7堆栈上。

The important difference being the change in character encoding between the two ruby versions which meant that Mechanize couldn't find a link with the unicode encoded character "\»" and thus didn't do any scraping. 重要的区别是两个ruby版本之间的字符编码的变化,这意味着Mechanize无法找到与unicode编码字符“\\ u00BB”的链接,因此没有做任何抓取。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM