
Screen scraping/Web scraping: URL doesn't change after search results

I'm working on a web scraping project, and I need a "start" page from which to begin the scraping.

I navigate to a URL like http://www.blah.com/search.aspx and enter my search data (because I want to pass the search results page to my screen scraping program). When the search results page comes back, the URL has not changed.

How can I get the URL of the search results page, or mimic the "search" so that the screen scraping program can fetch the search results page itself?

Found this after some more googling (it helps to use the right keywords).

I was already using HtmlAgilityPack; this just provides more insight into how to initiate the POST and read the results.

http://refactoringaspnet.blogspot.com/2010/04/using-htmlagilitypack-to-get-and-post.html
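
For anyone who wants to see the general idea of "mimicking the search", here is a rough sketch. It is not the code from the linked article and it does not use HtmlAgilityPack; it is plain Python with only the standard library, written under the assumption that the page is a typical ASP.NET Web Forms page that posts back to its own URL (which would explain why the URL does not change) and carries hidden inputs such as `__VIEWSTATE`. The field names `txtQuery` and `btnSearch` are hypothetical, and the URL is the placeholder from the question; the real names have to be read from the form's HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener


class HiddenFieldParser(HTMLParser):
    """Collect the name/value pairs of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() != "hidden":
            return
        name = attr.get("name")
        if name:
            self.fields[name] = attr.get("value") or ""


def build_search_post(page_html, form_values):
    """Return a URL-encoded POST body: hidden fields plus our own form values."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    fields = dict(parser.fields)
    fields.update(form_values)  # search box, submit button, etc.
    return urlencode(fields).encode("utf-8")


def fetch_search_results(url, form_values):
    """GET the search page, then POST the form back to the same URL."""
    # One opener for both requests so any session cookie is sent back.
    opener = build_opener(HTTPCookieProcessor())
    with opener.open(url) as response:
        page_html = response.read().decode("utf-8", errors="replace")
    request = Request(
        url,
        data=build_search_post(page_html, form_values),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with opener.open(request) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical usage; the field names below are assumptions, not real ones:
# html = fetch_search_results(
#     "http://www.blah.com/search.aspx",
#     {"txtQuery": "widgets", "btnSearch": "Search"},
# )
```

The returned HTML is the search results page, which can then be handed to whatever parser the scraper already uses. If the site builds its results with JavaScript instead of a form postback, this approach would not be enough.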
