Parse a website after POST request with Python

I want to scrape some addresses from a website's store locator so that I can put them into Google Maps and see where all the stores in my state are located.

Here is the store locator website: http://www.rockyboots.com/locator

When I go there, I enter my state, and it returns all the addresses. I am wondering how to do this with Python, specifically the Requests module.

I tried to reverse engineer what information was being sent when I submitted the form, and got:

form = {'dwfrm_storelocator_address_states_stateUSCA':'NS', 
        'dwfrm_storelocator_findbystate':'Search'}

so my request currently looks like:

r = requests.post(url, data = form)

where

url = 'http://www.rockyboots.com/locator'

This seems to just give me the locator page again, not the page I get after submitting the form.

At this point, I am not sure what else to try, so any info would be helpful. This is just a side project that I am using to learn some web scraping.

A user on Reddit (commandlineluser) helped me with the solution. It turns out there was another parameter that had to be passed in the POST request, and its value changes every time you access the URL.

The final solution is shown here.
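
The linked solution is not reproduced in this post, and the post does not say which parameter changes. As a rough sketch of the general idea only: assuming the changing value appears in the page's HTML, either as a hidden input or inside the form's action URL, one option is to fetch the page first, read those values, and send them back within the same session. The names `FormCollector`, `build_post`, and `search_by_state` are made up for this illustration, and the approach would not work if the value is set by JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormCollector(HTMLParser):
    """Record the action URL, field names, and hidden inputs of each <form>."""

    def __init__(self):
        super().__init__()
        self.forms = []        # one dict per <form> found on the page
        self._current = None   # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action") or "",
                             "hidden": {}, "names": set()}
            self.forms.append(self._current)
        elif self._current is not None and tag in ("input", "select", "button"):
            name = attrs.get("name")
            if not name:
                return
            self._current["names"].add(name)
            if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
                self._current["hidden"][name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def build_post(html, page_url, form_fields):
    """Return (post_url, payload) for the form that contains our fields."""
    collector = FormCollector()
    collector.feed(html)
    for form in collector.forms:
        if any(name in form["names"] for name in form_fields):
            payload = dict(form["hidden"])   # values that may change per visit
            payload.update(form_fields)      # the fields we fill in ourselves
            return urljoin(page_url, form["action"]), payload
    raise ValueError("no form on the page contains the expected fields")


def search_by_state(page_url, form_fields):
    """GET the page, then POST the form within the same session."""
    import requests  # third-party; imported here so the parser works without it

    with requests.Session() as session:      # keeps cookies between requests
        page = session.get(page_url, timeout=30)
        page.raise_for_status()
        post_url, payload = build_post(page.text, page_url, form_fields)
        result = session.post(post_url, data=payload, timeout=30)
        result.raise_for_status()
        return result.text
```

With the `url` and `form` values from the question, this would be called as `search_by_state(url, form)`. Whether it returns the results page depends on the assumption above holding for this particular site.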
