How can I manually authenticate before Scrapy runs?
I want to scrape a web page that uses a ridiculous number of captcha challenges before I can log in (e.g. more than 20 challenges in sequence).
How can I log in by solving the captchas myself, with my own hands (i.e. not with Selenium etc.), and then have the web scraping run? I have searched the Scrapy documentation, tutorials and the web for code that does this and found nothing.
Obligatory code that doesn't do the thing I am asking about:
import scrapy


class BadSpider(scrapy.Spider):
    name = "bad"

    def start_requests(self):
        [...]

    def parse(self, response):
        if response.url.endswith('/login'):
            print('!!!!! I have no idea what to do here!!!!')
        else:
            [...]
I want it to start after I have manually authenticated. Instead, it starts before I have logged in, so it cannot get any further.
Use the "Copy as cURL (bash)" option.
PS: I would suggest performing this action in Mozilla Firefox, because Chrome's DevTools sometimes produces incorrect results in https://curl.trillworks.com/
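A command copied with "Copy as cURL (bash)" after logging in by hand carries the headers and cookies of that authenticated session. As a rough sketch of how those could be pulled out for reuse, the snippet below parses such a command with the standard library only. The function names `parse_curl` and `_split_cookies`, the example.com URL and the cookie values are all made up for illustration, and the parser only handles the simple `-H` / `-b` forms, not every flag curl supports.

```python
import shlex


def _split_cookies(cookie_header):
    """Turn 'a=1; b=2' into {'a': '1', 'b': '2'} (hypothetical helper)."""
    pairs = {}
    for part in cookie_header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            pairs[name] = value
    return pairs


def parse_curl(curl_command):
    """Return (url, headers, cookies) from a 'Copy as cURL (bash)' string.

    Simplified on purpose: only -H/--header and -b/--cookie are understood,
    and the first token that looks like an http(s) URL is taken as the URL.
    """
    # Join bash line continuations, then split the way a shell would.
    tokens = shlex.split(curl_command.replace("\\\n", " "))
    url, headers, cookies = None, {}, {}
    i = 1  # skip the leading "curl"
    while i < len(tokens):
        tok = tokens[i]
        has_value = i + 1 < len(tokens)
        if tok in ("-H", "--header") and has_value:
            name, _, value = tokens[i + 1].partition(":")
            name, value = name.strip(), value.strip()
            if name.lower() == "cookie":
                cookies.update(_split_cookies(value))
            else:
                headers[name] = value
            i += 2
        elif tok in ("-b", "--cookie") and has_value:
            cookies.update(_split_cookies(tokens[i + 1]))
            i += 2
        else:
            if url is None and tok.startswith(("http://", "https://")):
                url = tok
            i += 1
    return url, headers, cookies


if __name__ == "__main__":
    # Placeholder command; paste the one copied from the browser instead.
    example = (
        "curl 'https://example.com/account' \\\n"
        "  -H 'User-Agent: Mozilla/5.0' \\\n"
        "  -H 'Cookie: sessionid=abc123; csrftoken=xyz'"
    )
    print(parse_curl(example))
```

In a spider, the resulting values could then be passed from `start_requests` as `scrapy.Request(url, headers=headers, cookies=cookies, callback=self.parse)`, since `scrapy.Request` accepts both `headers` and `cookies` arguments. Whether the site keeps accepting the session for the whole crawl depends on how long its cookies stay valid.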