Error scraping aspx site with Ruby Mechanize. Mechanize::ResponseCodeError: 404 => Net::HTTPNotFound

I'm trying to scrape a ratings website with Ruby's Mechanize, and am having a world of trouble. My code is pretty simple:

require "mechanize"
@client = Mechanize.new
@client.get("http://cape.ucsd.edu/responses/Results.aspx")

At that point, the request fails with the 404 error.

I've tried a few things, including using HTTParty to look for redirects, disabling SSL checking, and even saving the HTML file locally (to get the proper query form) and then trying to submit it directly from an agent connected to the main site. All of these lead to the same error.

I'm fairly new to scraping, and I'm hoping I'm just doing something silly. Any help would be appreciated.

Yes, it's the user agent. To set the user agent, do:

require "mechanize"
@client = Mechanize.new
# Set a browser-like User-Agent before issuing any request.
@client.user_agent = 'Mozilla'
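
If you want to check for yourself that the User-Agent header is what triggers the 404, one option is to compare status codes with plain Net::HTTP from the standard library. This is only a rough sketch: build_request and status_for are hypothetical helper names made up for this example, not part of Mechanize, and the site may not behave today as it did when the question was asked.

```ruby
require "net/http"
require "uri"

# Build a GET request that carries an explicit User-Agent header.
# (Hypothetical helper for illustration; not a Mechanize API.)
def build_request(url, user_agent)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request["User-Agent"] = user_agent
  request
end

# Send the request and return the HTTP status code as a string.
# This performs a real network call, so it is not run automatically here.
def status_for(url, user_agent)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(build_request(url, user_agent))
  end
  response.code
end

# Example comparison (uncomment to run against the live site):
# url = "http://cape.ucsd.edu/responses/Results.aspx"
# puts status_for(url, "Ruby")
# puts status_for(url, "Mozilla")
```

If the two calls return different status codes, the server is filtering on User-Agent, and setting @client.user_agent as shown above should be enough.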
