
HttpWebRequest Simulate Click

I am working with HttpWebRequest and trying to search Google, get the results, and simulate a click on a desired link. Is that possible?

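    // Needs: using System.IO; using System.Net; using System.Text; using System.Web;
    string raw = "http://www.google.com/search?hl=en&q={0}&aq=f&oq=&aqi=n1g10";
    string search = string.Format(raw, HttpUtility.UrlEncode(searchTerm));
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(search);
    request.Proxy = prox;
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    // Assumes a UTF-8 response; Encoding.ASCII would turn non-ASCII characters into '?'.
    using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
    {
        // Read the body once: a second ReadToEnd() returns an empty string,
        // and the result is a string, not an HtmlElementCollection.
        string html = reader.ReadToEnd();
        browserA = html;
        this.Invoke(new EventHandler(IE1));
    }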

You could parse the page using Html Agility Pack (http://htmlagilitypack.codeplex.com/) or LINQ to HTML (http://www.justagile.com/linq-to-html.aspx), in conjunction with regular expressions if needed, to find the elements you want to "click", and then issue a new HttpWebRequest for the URL of each of those elements. This technique is called web scraping (http://en.wikipedia.org/wiki/Web_scraping).
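As a rough sketch of that parse-then-request idea, assuming the Html Agility Pack NuGet package is referenced: the class name `LinkClicker` and both method names are hypothetical, and the filter (a plain substring match on the href) is only one possible way to pick the "desired" link. The markup of a real results page may need a more specific XPath.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using HtmlAgilityPack; // third-party NuGet package, not part of .NET

// Hypothetical helper, not part of any library.
public static class LinkClicker
{
    // Returns the href of every <a> element whose href contains `fragment`.
    public static List<string> FindLinks(string html, string fragment)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var links = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null) return links;

        foreach (HtmlNode a in anchors)
        {
            // href values may carry HTML entities such as &amp;, so decode them first.
            string href = WebUtility.HtmlDecode(a.GetAttributeValue("href", ""));
            if (href.Contains(fragment)) links.Add(href);
        }
        return links;
    }

    // A "click" over HTTP is just a GET for the link target, sent with the
    // page that contained the link as the Referer header.
    public static HttpWebRequest CreateClick(string pageUrl, string href)
    {
        var target = new Uri(new Uri(pageUrl), href); // resolves relative hrefs
        var request = (HttpWebRequest)WebRequest.Create(target);
        request.Referer = pageUrl;
        return request;
    }
}
```

With the `html` string read from the search response, `LinkClicker.FindLinks(html, "example.com")` would list candidate links, and `LinkClicker.CreateClick(search, links[0]).GetResponse()` would fetch the first one. No JavaScript runs in this approach, so anything the page does in a script on click is not reproduced.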

Also remember that the site you are scraping may ban your IP address if it receives a lot of requests from it. To avoid that, consider using a list of proxy servers.
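One way to work through such a proxy list is simple round-robin rotation. The sketch below is illustrative only: `ProxyRotator` is a hypothetical class, the proxy addresses you pass in are your own, and it is not thread-safe as written.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical helper that assigns a different proxy to each new request.
public sealed class ProxyRotator
{
    private readonly List<WebProxy> _proxies = new List<WebProxy>();
    private int _next;

    // Addresses are expected in a form such as "http://host:port".
    public ProxyRotator(IEnumerable<string> proxyAddresses)
    {
        foreach (string address in proxyAddresses)
            _proxies.Add(new WebProxy(address));
        if (_proxies.Count == 0)
            throw new ArgumentException("At least one proxy address is required.");
    }

    // Hands out proxies in round-robin order.
    public WebProxy Next()
    {
        WebProxy proxy = _proxies[_next];
        _next = (_next + 1) % _proxies.Count;
        return proxy;
    }

    // Creates a request that goes through the next proxy in the list.
    public HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Proxy = Next();
        return request;
    }
}
```

In the question's code, `request.Proxy = prox;` could then become `request.Proxy = rotator.Next();`. Spacing requests out in time is a separate measure that rotation does not replace.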

A better option is to use one of Google's APIs.

There is a list of all of them here: Google APIs

Here is another on CodePlex: Google Dot Net

They have services that allow applications to use Google freely. For most of these there are WSDL files you can use with "Add Web Reference" in Visual Studio.
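For illustration, here is what calling a search API instead of scraping might look like. This sketch uses Google's Custom Search JSON API, which is not one of the WSDL services mentioned above and is chosen here only as an example; `GoogleSearchApi` is a hypothetical class, and the API key and search engine ID (`cx`) are values you would have to obtain from Google yourself.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical wrapper around the Custom Search JSON API endpoint.
public static class GoogleSearchApi
{
    // Builds the request URL; every parameter value is URL-escaped.
    public static string BuildUrl(string apiKey, string engineId, string query)
    {
        return "https://www.googleapis.com/customsearch/v1"
            + "?key=" + Uri.EscapeDataString(apiKey)
            + "&cx=" + Uri.EscapeDataString(engineId)
            + "&q=" + Uri.EscapeDataString(query);
    }

    // Returns the raw JSON response; the search results are in its "items" array.
    public static string Search(string apiKey, string engineId, string query)
    {
        var request = (HttpWebRequest)WebRequest.Create(BuildUrl(apiKey, engineId, query));
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Each entry in `items` carries a `link` field, so the result URLs arrive as structured data and no HTML parsing is needed to find the one to request next.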

Regex and Html Agility Pack should only be used as a last resort, when a website does not expose public services (I had to use them recently for something I'm writing to integrate with uTorrent and BtJunkie). Google obviously wants people to develop against its sites through these APIs.


 