Delayed_Job scraper works in development but not on Heroku
Here's the code for the scraper:
    class Scrape
      def perform
        url = "# a long url"
        agent = Mechanize.new
        agent.get(url)
        while agent.page.link_with(:text => "Next Page \u00BB")
          agent.page.search(".content").each do |item|
            puts "."
            House.create!({
              # attributes...
            })
          end
          agent.page.link_with(:text => "Next Page \u00BB").click
        end
      end
    end
In my local environment I can run it from the Rails console just by typing:

    Scrape.new.delay.perform # to queue the job
    rake jobs:work
and it works perfectly.
However, running the analogous commands in the Heroku console (with a worker dyno running instead of rake jobs:work) doesn't seem to do anything. I tried logging some lines in the Heroku log: the url variable does get logged (so the method is at least being called), but the "." that should be printed on each iteration of the while loop never appears, and no Houses are created in the database.
Does anyone have any idea what might be wrong?
Solved this problem myself; it was a pretty obscure bug. I was using Ruby 1.9.2 in my local environment, but the app was deployed on a Ruby 1.8.7 stack.
The important difference is the change in string encoding handling between the two Ruby versions: under 1.8.7 the escape "\u00BB" is not interpreted as the » character, so Mechanize couldn't find a link with that text and never did any scraping.
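The root cause can be demonstrated without Mechanize at all. A minimal sketch of two version-agnostic workarounds follows; the link_with call is the one from the scraper above, and the regex form assumes no other link on the page contains the text "Next Page":

```ruby
# Ruby 1.8.7 does not understand the "\uXXXX" escape: the literal
# "Next Page \u00BB" contains the characters u00BB, not the » glyph,
# so link_with never matches and the while loop body never runs.

# Option 1: build » from its Unicode codepoint. Array#pack with the
# 'U' directive emits UTF-8 bytes on both 1.8.7 and 1.9.x.
next_label = "Next Page #{[0xBB].pack('U')}"

# Option 2 (sketch): sidestep exact-text matching with a regex:
#   agent.page.link_with(:text => /Next Page/)
```

On 1.9.x the pack form and the "\u00BB" literal produce the same string, while on 1.8.7 only the pack form yields the actual » bytes.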