How to get the webpage source code using C#
I know about the WebRequest and WebResponse objects. The problem is that I do not really want to get the source code of the webpage; I only want to check whether the link exists. If I use the GetResponse method, it goes and pulls the entire source code of the page.
I am creating a broken link checker that has to handle many links, and it takes quite a while to check them all. Is there a way to get MINIMAL information from a web link? Only enough information to see if the link is valid or broken (not the entire source code).
An answer (BESIDES USING ASYNCHRONOUS TRANSFER) would be greatly appreciated!
A standard way of checking the existence of a link is to use a HEAD request, which causes the remote server to send the headers for the requested object, but not the object itself. If the object is not on the server, the server gives you the normal 404 response; if it does exist, you get a 200 response and no data after the headers. This way very little unneeded data goes over the wire.
WebRequest request = HttpWebRequest.Create("http://www.foo.com/");
request.Method = "HEAD"; // Just get the document headers, not the data.
HEAD is similar to GET, except that instead of getting the file contents, we get just the headers.
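To round out the two-line snippet, here is a minimal sketch of how the HEAD request might be wrapped into a reusable synchronous check. The names `LinkChecker`, `IsSuccess` and `LinkExists`, and the 5-second timeout, are illustrative assumptions and not part of the original answer. It also assumes that any 2xx status counts as "exists" and that any `WebException` counts as "broken".

```csharp
using System;
using System.Net;

static class LinkChecker
{
    // Hypothetical helper: treats any 2xx status code as "the link exists".
    public static bool IsSuccess(HttpStatusCode status)
    {
        int code = (int)status;
        return code >= 200 && code < 300;
    }

    // Sends a synchronous HEAD request and reports whether the server
    // answered with a 2xx status. Assumes an http:// or https:// URL;
    // other schemes would make the cast below fail.
    public static bool LinkExists(string url, int timeoutMs = 5000)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";      // ask for headers only, no body
        request.Timeout = timeoutMs;  // assumed value; tune for your link list

        try
        {
            // Dispose the response so the connection is released promptly.
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return IsSuccess(response.StatusCode);
            }
        }
        catch (WebException)
        {
            // GetResponse throws for 4xx/5xx statuses as well as for
            // DNS failures and timeouts; all are treated as "broken" here.
            return false;
        }
    }
}
```

A call such as `LinkChecker.LinkExists("http://www.foo.com/")` would then return `true` or `false` for a single link. Some servers reject HEAD (for example with 405 Method Not Allowed), so a link reported as broken this way may deserve a follow-up GET before it is flagged.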