Scrapy returning 301 for reddit.com

I'm using reddit as the basis for learning Scrapy. It was working fine for a while, but now it always returns a 301 redirect, even when simply calling the shell with "scrapy shell www.reddit.com". Any ideas how to fix this?

Use the https/http scheme in the URL:

scrapy shell https://www.reddit.com
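
If the start URLs come from a list where some entries lack a scheme, the same fix can be applied in code before the requests are made. The sketch below is only an illustration: `ensure_scheme` is a hypothetical helper name, not part of Scrapy, and defaulting to https assumes the site serves https (which would explain a 301 when the plain-http address is requested).

```python
def ensure_scheme(url: str, default_scheme: str = "https") -> str:
    """Return url unchanged if it already has a scheme, else prepend one.

    Hypothetical helper, not a Scrapy API. The default of "https" is an
    assumption about the target site.
    """
    if "://" in url:
        # The URL already names a scheme (http, https, ...); leave it alone.
        return url
    return f"{default_scheme}://{url}"


# Normalize a list of start URLs before handing them to a spider.
start_urls = [ensure_scheme(u) for u in ["www.reddit.com", "http://www.reddit.com"]]
```

The explicit scheme matters because a bare host name leaves the choice of http or https to the tool, and the site may answer the http variant with a redirect.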

I think this is related to reddit itself: it seems to be blocking your IP or user agent. Try tweaking the following:
1- Raise DOWNLOAD_DELAY in the Scrapy settings
2- Change your user agent
3- Use a proxy with Scrapy
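
The first two suggestions map to two entries in the project's settings.py. This is a minimal sketch: the delay value and the user agent string are placeholder assumptions, not values known to work for reddit.

```python
# settings.py (fragment)

# Wait this many seconds between consecutive requests to the same site.
# 3 is an arbitrary example value; tune it for the target site.
DOWNLOAD_DELAY = 3

# Identify the crawler explicitly instead of using Scrapy's default
# user agent. The name and contact URL below are placeholders.
USER_AGENT = "my-learning-bot/0.1 (+https://example.com/contact)"
```

The same keys can also be set per spider through its custom_settings dictionary if only one spider needs them.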

For more info on the settings: http://doc.scrapy.org/en/latest/topics/settings.html

For Scrapy proxies: http://doc.scrapy.org/en/latest/topics/downloader-middleware.html
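
As a rough sketch of the proxy suggestion: Scrapy's built-in HttpProxyMiddleware reads the proxy address from request.meta["proxy"], so a small custom downloader middleware can set that key on every request. The class name, module path, and proxy URL below are hypothetical placeholders.

```python
# middlewares.py (fragment)

# Placeholder proxy address; replace with a proxy you are allowed to use.
PROXY_URL = "http://127.0.0.1:8080"


class StaticProxyMiddleware:
    """Hypothetical downloader middleware that routes every request
    through a single fixed proxy."""

    def process_request(self, request, spider):
        # The built-in HttpProxyMiddleware picks this value up later.
        request.meta["proxy"] = PROXY_URL
        # Returning None lets Scrapy continue processing the request.
        return None


# settings.py (fragment): enable the middleware. "myproject" is an assumed
# project name. 350 is an example order chosen to run before the built-in
# HttpProxyMiddleware, whose default order is 750.
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.StaticProxyMiddleware": 350,
}
```

For a single request, setting meta={"proxy": ...} directly on the Request achieves the same effect without a middleware.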
