
Getting Nokogiri to work with Delayed jobs

I am trying to get Nokogiri to behave when used with delayed jobs, but I haven't had much success so far.

Basically, I am trying to run a parsing task in the background, but when the background worker hits my perform method, it fails on the following line:

HTML_page = Nokogiri::HTML(open('http://www.mysite.com'))

The error message is:

Nokogiri::HTML::Document#inspect failed with ArgumentError: Requires a Node, NodeSet or String argument, and cannot accept a Delayed::Backend::ActiveRecord::Job.

This happens with both the Delayed::Job.enqueue and delay methods.

If I try the line below in the console, I get the same error:

Nokogiri::HTML(open('http://www.mysite.com')).delay

It might be a silly oversight, as I am fairly new to Ruby and Rails, so any help would be greatly appreciated.

Since Nokogiri "Requires a Node, NodeSet or String argument", why not give it one?

Instead of:

HTML_page = Nokogiri::HTML(open('http://www.mysite.com'))

try:

HTML_page = Nokogiri::HTML(open('http://www.mysite.com').read)

That will cause IO to read the file handle created by open and pass Nokogiri the string content of the URL being read.
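To make the difference concrete, here is a minimal sketch of what .read changes. It assumes a StringIO can stand in for the handle that open returns (open-uri hands back a StringIO for small responses and a Tempfile for larger ones), so it runs without a network connection. The read_body helper is a hypothetical name used only for illustration.

```ruby
require 'stringio'

# Hypothetical helper: drain an IO-like handle into a plain String,
# one of the argument types named in the error message.
def read_body(handle)
  handle.read
end

# Stand-in for open('http://www.mysite.com'); no network access needed.
# (On Ruby 3.0 and later the real call is URI.open after require 'open-uri'.)
handle = StringIO.new('<html><body><h1>Hello</h1></body></html>')
body = read_body(handle)

puts handle.class  # class of the IO-like wrapper
puts body.class    # class of the content after .read
```

The first line printed is the wrapper's class and the second is the class of what .read returns, which is the value Nokogiri::HTML would receive in the suggested fix.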

An alternate way to help debug the problem, which I don't think lies within Nokogiri, is to split your command up a bit:

body = open('http://www.mysite.com').read
HTML_page = Nokogiri::HTML(body)
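Taking the split one step further, each step can be wrapped so that a failure names the step it came from. This is only a sketch: run_step is a hypothetical helper, and the blocks use stand-ins so the example runs without a network connection or the nokogiri gem. In the real job the blocks would hold open('http://www.mysite.com'), handle.read and Nokogiri::HTML(body).

```ruby
require 'stringio'

# Hypothetical helper: run one step, report the class of its result,
# and name the step before re-raising if it fails.
def run_step(name)
  result = yield
  puts "#{name}: ok (#{result.class})"
  result
rescue StandardError => e
  puts "#{name}: failed with #{e.class}: #{e.message}"
  raise
end

# Stand-in for open('http://www.mysite.com')
handle = run_step('open') { StringIO.new('<title>Example</title>') }

# Same call as in the split version above
body = run_step('read') { handle.read }

# Stand-in for Nokogiri::HTML(body): pull the title out with a regexp
title = run_step('parse') { body[/<title>(.*?)<\/title>/, 1] }
```

If the output shows the open, read and parse steps all succeeding while the job still fails, the problem is more likely in how the work is handed to the delayed job than in the parsing itself.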
