Python web scraper and input

I started building a program, for my own use at work, that scrapes mortgage rates from websites and enters data into them. Essentially, I want the program to log into each website, enter the necessary mortgage data, return the rates, and compare the sites, so that I don't have to do this manually on each one.

The problem I didn't think of is the login portion. I would have to store tokens and a few other items in order to navigate from page to page within each website.

My question is: is this even possible, given that I don't know which credentials/tokens to send to each page within a site? (I have the login info, but I'm unsure whether I need more than just the credentials and the tokens.)

This is complicated to do with just the requests module.
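To illustrate where the complication comes from: a requests Session keeps cookies between calls, but any hidden form token (a CSRF field, for example) has to be pulled out of the login page and posted back by hand. The following is only a minimal sketch under assumptions: the form field names "username" and "password" and the login URL passed in are hypothetical, and a site that generates its tokens in JavaScript will not work this way at all.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> fields."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.hidden[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return the hidden form fields found in an HTML string."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.hidden


def login(login_url, username, password):
    """Log in and return a session that carries the site's cookies."""
    import requests  # third-party: pip install requests

    session = requests.Session()  # cookies persist across requests
    page = session.get(login_url, timeout=30)
    page.raise_for_status()

    # Send back whatever hidden tokens the login form contained.
    form = hidden_fields(page.text)
    # Hypothetical field names; inspect the real form to find them.
    form.update({"username": username, "password": password})

    response = session.post(login_url, data=form, timeout=30)
    response.raise_for_status()
    return session
```

Every later page with its own form would need the same extract-and-resend step, which is what makes this route tedious.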

Note that driving a real browser, as suggested below, requires more system resources than sending plain HTTP requests.

You can use Playwright to control a Chromium instance.

Like nearly every other browser, Chromium keeps the credentials and tokens for you, so you just have to program the browser to log in and scrape.
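A rough sketch of that idea with Playwright's synchronous Python API might look like the following. Everything site-specific is an assumption: the keys of the `site` dictionary (URLs and CSS selectors) are hypothetical placeholders that you would fill in per lender, and the step of entering the mortgage data is left out because it depends entirely on each site's form.

```python
import re


def parse_rate(text):
    """Extract the first percentage such as '6.125%' from text as a float."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*%", text)
    return float(match.group(1)) if match else None


def fetch_rate(site, username, password):
    """Log into one site in a headless Chromium and read its quoted rate."""
    # third-party: pip install playwright && playwright install chromium
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()

        # Log in; the browser stores the resulting cookies and tokens.
        page.goto(site["login_url"])
        page.fill(site["user_selector"], username)
        page.fill(site["password_selector"], password)
        page.click(site["submit_selector"])
        page.wait_for_load_state("networkidle")

        # Later navigation reuses the logged-in session automatically.
        page.goto(site["rates_url"])
        text = page.inner_text(site["rate_selector"])
        browser.close()
    return parse_rate(text)


def lowest_rate(rates):
    """Return (site_name, rate) for the lowest known rate, or None."""
    known = {name: rate for name, rate in rates.items() if rate is not None}
    return min(known.items(), key=lambda kv: kv[1]) if known else None
```

Calling `fetch_rate` once per site and passing the collected results to `lowest_rate` gives the comparison described in the question.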
