I've been researching how to do this and unfortunately cannot find any answers. I would like to find the exact size of a specific web page or, if that is not possible, the word or character count of that page's source code.
I want this in order to find which pages of a web site exist. The plan is to use strings, If statements, and For loops to try every possible numeric combination after the "/" at the end of the website's URL, and to exclude pages below a certain size, since those would be pages that load with a 404 or "does not exist" error. The only remaining pages would be those that exist. The pages I am looking for have no links to them and cannot be found on a search engine. The only way to reach one is to type the exact number after the "/", for example: ( http://website.com/123456789 ). Perhaps this improvised brute-force method will find the pages I am looking for. Thanks!
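Dim URL As String = "http://website.com/123456789"
'Request only the headers, not the page body
Dim req As System.Net.WebRequest = System.Net.WebRequest.Create(URL)
req.Method = "HEAD"
Dim File_Size As String = "unknown"
'Retrieve the response (GetResponse throws a WebException for a 404)
Using resp As System.Net.WebResponse = req.GetResponse()
    Dim ContentLength As Long = 0
    'Find the file size; the server may omit the Content-Length header
    If Long.TryParse(resp.Headers.Get("Content-Length"), ContentLength) Then
        'Use Double so the fractional part survives the division
        Dim result As Double
        If ContentLength >= 1073741824 Then
            result = ContentLength / 1073741824
            File_Size = result.ToString("0.00") & " GB"
        ElseIf ContentLength >= 1048576 Then
            result = ContentLength / 1048576
            File_Size = result.ToString("0.00") & " MB"
        Else
            result = ContentLength / 1024
            File_Size = result.ToString("0.00") & " KB"
        End If
    End If
End Using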
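The question also asks for a word or character count of the page source, and a HEAD request cannot help when the server leaves out Content-Length. One possible fallback, sketched below, is to download the body and measure it directly. The module name `PageSize` and the helpers `GetPageSizeInBytes`, `CountCharacters`, and `CountWords` are hypothetical names chosen for illustration, and "word" is assumed here to mean a run of non-whitespace characters.

```vbnet
Imports System
Imports System.Net.Http

Module PageSize
    ' One shared client for every request (assumption: a simple console program).
    Private ReadOnly Client As New HttpClient()

    ' Downloads the page and returns the size of its body in bytes.
    Public Function GetPageSizeInBytes(ByVal url As String) As Long
        Dim body As Byte() = Client.GetByteArrayAsync(url).GetAwaiter().GetResult()
        Return body.LongLength
    End Function

    ' Number of characters in the page source.
    Public Function CountCharacters(ByVal source As String) As Integer
        Return source.Length
    End Function

    ' Number of whitespace-separated words in the page source.
    ' An empty separator array makes Split break on any whitespace.
    Public Function CountWords(ByVal source As String) As Integer
        Return source.Split(New Char() {}, StringSplitOptions.RemoveEmptyEntries).Length
    End Function
End Module
```

To count words or characters of a live page, the source could be fetched with `Client.GetStringAsync(url)` and passed to the two counting helpers. Downloading every body is much slower than a HEAD request, so this is probably only worth doing when the header is missing.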
Check for 404 pages by testing the HTTP status code instead of the page size:
'Place at the top of the file
Imports System.Net.Http

'Returns True when the server answers with a success (2xx) status code
Private Function RemoteFileOk(ByVal Url As String) As Boolean
    Using client As New HttpClient()
        'ResponseHeadersRead completes once the headers arrive, before the body is downloaded
        Using response As HttpResponseMessage = client.GetAsync(Url, HttpCompletionOption.ResponseHeadersRead).GetAwaiter().GetResult()
            Return response.IsSuccessStatusCode
        End Using
    End Using
End Function
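Putting the status check together with the numeric loop described in the question might look roughly like the sketch below. It is only an illustration under stated assumptions: `PageProbe`, `PageExists`, and `FindExistingPages` are hypothetical names, the ID range in `Main` is arbitrary, and the site is assumed to answer HEAD requests and to return a real 404 for missing pages. A site that returns status 200 with a "not found" page would still need the size comparison.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Net.Http

Module PageProbe
    ' One shared client for all requests instead of a new client per URL.
    Private ReadOnly Client As New HttpClient()

    ' Hypothetical helper: True when the URL answers with a 2xx status code.
    Public Function PageExists(ByVal url As String) As Boolean
        Try
            Using request As New HttpRequestMessage(HttpMethod.Head, url)
                Using response As HttpResponseMessage = Client.SendAsync(request).GetAwaiter().GetResult()
                    Return response.IsSuccessStatusCode
                End Using
            End Using
        Catch ex As HttpRequestException
            ' Network-level failure (DNS error, refused connection): treat as not found.
            Return False
        End Try
    End Function

    ' Tries every number from firstId to lastId after the "/" and returns
    ' the URLs for which the supplied check reports True.
    Public Function FindExistingPages(ByVal baseUrl As String, ByVal firstId As Long, ByVal lastId As Long, ByVal exists As Func(Of String, Boolean)) As List(Of String)
        Dim found As New List(Of String)
        For id As Long = firstId To lastId
            Dim candidate As String = baseUrl.TrimEnd("/"c) & "/" & id.ToString()
            If exists(candidate) Then found.Add(candidate)
        Next
        Return found
    End Function

    Sub Main()
        ' Arbitrary example range; a real scan would cover far more IDs.
        For Each url As String In FindExistingPages("http://website.com", 123456780, 123456789, AddressOf PageExists)
            Console.WriteLine(url)
        Next
    End Sub
End Module
```

Passing the check in as a `Func(Of String, Boolean)` keeps the loop independent of the network, so a size-based test can be swapped in without touching `FindExistingPages`. Scanning every nine-digit number means up to a billion requests, so in practice the range would need to be narrowed and the requests throttled.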