
How can I use Visual Basic to find the size of a specific web page?

I've been researching how to do this and unfortunately cannot find any answers. I would like to find the exact size of a specific web page or, if that is not possible, the word or character count of the page's source code.

I want this in order to tell apart the pages of a website that exist from the ones that don't. Using Strings, If statements, and For loops, I would try every possible numeric combination after the "/" at the end of the website's URL and exclude any page below a certain size. Those small pages would be the ones that load with a 404 or "does not exist" error, so the only pages left would be the ones that exist. The pages I am looking for are not linked from anywhere and cannot be found with a search engine. The only way to reach one is to type its exact number after the "/", for example: http://website.com/123456789. Perhaps this improvised brute-force method will find the pages I am looking for. Thanks!

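    ' Address of the page to measure (example value taken from the question)
    Dim url As String = "http://website.com/123456789"

    ' Send a HEAD request so that only the headers are transferred, not the body
    Dim req As System.Net.WebRequest = System.Net.WebRequest.Create(url)
    req.Method = "HEAD"

    Dim fileSize As String = "unknown"

    ' Retrieve the response; GetResponse throws a WebException for 4xx/5xx statuses such as 404
    Using resp As System.Net.WebResponse = req.GetResponse()
        Dim contentLength As Long = 0

        ' The server may omit Content-Length, so parse the header defensively
        If Long.TryParse(resp.Headers.Get("Content-Length"), contentLength) Then
            ' "/" yields a Double in VB, so fractional sizes are kept until formatting
            If contentLength >= 1073741824 Then
                fileSize = (contentLength / 1073741824).ToString("0.00") & " GB"
            ElseIf contentLength >= 1048576 Then
                fileSize = (contentLength / 1048576).ToString("0.00") & " MB"
            Else
                fileSize = (contentLength / 1024).ToString("0.00") & " KB"
            End If
        End If
    End Using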

Check for 404 pages

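' Requires: Imports System.Net.Http

' Returns True when the server answers with a 2xx status code; a 404 returns False.
Private Function RemoteFileOk(ByVal Url As String) As Boolean
    Using client As New HttpClient
        ' ResponseHeadersRead completes as soon as the headers arrive, without downloading the body.
        ' GetAwaiter().GetResult() blocks like Wait() but rethrows the original exception.
        Using response As HttpResponseMessage =
                client.GetAsync(Url, HttpCompletionOption.ResponseHeadersRead).GetAwaiter().GetResult()
            Return response.IsSuccessStatusCode
        End Using
    End Using
End Function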
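
To connect the status check back to the loop described in the question, here is a minimal sketch of how the two might fit together. The module name PageScanner and the helpers CandidateUrls, PageExists, and ExistingPages are hypothetical names made up for this illustration, and the sketch assumes the server answers HEAD requests and returns a real 404 for missing pages; if it returns 200 with an error page instead, a size threshold like the one in the first snippet would still be needed.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Net.Http
Imports System.Threading.Tasks

Module PageScanner
    ' One shared client for every request; creating a client per URL can exhaust sockets.
    Private ReadOnly Client As New HttpClient()

    ' Hypothetical helper: builds the URLs baseUrl/first through baseUrl/last.
    Public Function CandidateUrls(baseUrl As String, first As Long, last As Long) As List(Of String)
        Dim urls As New List(Of String)
        For n As Long = first To last
            urls.Add(baseUrl.TrimEnd("/"c) & "/" & n.ToString())
        Next
        Return urls
    End Function

    ' Returns True when the server answers a HEAD request with a 2xx status code.
    Public Function PageExists(url As String) As Boolean
        Try
            Using request As New HttpRequestMessage(HttpMethod.Head, url)
                Using response As HttpResponseMessage = Client.SendAsync(request).GetAwaiter().GetResult()
                    Return response.IsSuccessStatusCode
                End Using
            End Using
        Catch ex As HttpRequestException
            ' Network failure (DNS error, refused connection): treated as "not found" here
            Return False
        Catch ex As TaskCanceledException
            ' Request timed out: also treated as "not found" in this sketch
            Return False
        End Try
    End Function

    ' Keeps only the URLs for which the check passes. The check is a parameter
    ' so the loop can be exercised without touching the network.
    Public Function ExistingPages(urls As IEnumerable(Of String), check As Func(Of String, Boolean)) As List(Of String)
        Dim found As New List(Of String)
        For Each u As String In urls
            If check(u) Then found.Add(u)
        Next
        Return found
    End Function
End Module
```

A call such as ExistingPages(CandidateUrls("http://website.com", 123456780, 123456789), AddressOf PageExists) would then return the candidates that responded successfully. Keep in mind that a nine-digit number has a billion combinations, so the range probably needs to be narrowed and the requests throttled.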
