How to speed up scraping in Rails
This code works fine locally, but on Heroku it takes more than 30 seconds and hits the request timeout:
if @url
  @arr = Array.new
  begin
    doc = Nokogiri::HTML(open(@url))
    doc.css(".new-cars-results-box").each do |item|
      hash = Hash.new
      type = item.at_css(".new-car-name").text
      link = "http://uae.yallamotor.com" + item.at_css(".new-car-name")[:href]
      @arr << [link, type]
    end
  rescue
  end
end
How can I speed this up?
You query the DOM twice for every result box (two at_css calls per item) when you could query once for all the '.new-car-name' elements, and you create a hash for each item that is never used.
Try this:
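if @url
  @arr = Array.new
  begin
    doc = Nokogiri::HTML(open(@url))
    url_prefix = 'http://uae.yallamotor.com'
    # One CSS query for all name links instead of two at_css calls per result box
    doc.css(".new-cars-results-box > .new-car-name").each do |item|
      type = item.text
      link = url_prefix + item[:href]
      @arr << [link, type]
    end
  rescue
    # Errors are swallowed here, as in the original code
  end
end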
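Also try replacing this line (the > combinator matches only direct children of the result box, so it returns nothing if the link is nested deeper):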
doc.css(".new-cars-results-box > .new-car-name").each do |item|
with this:
doc.css(".new-car-name").each do |item|