
VB.NET HtmlAgilityPack: looping SelectNodes over several pages

I made an array of links:

Dim horof As String = "A B C D"
    Dim alphabarray As String() = horof.Split(New Char() {" "c})
    Dim urls() As String = alphabarray.Select(Function(o) "http://somelink/list-" & o).ToArray()

The output looks like this:

http://somelink/list-A
http://somelink/list-B
http://somelink/list-C
http://somelink/list-D

Then I made a web request for each link like this:

 For i As Int32 = 0 To urls.Length - 1
        Dim wRequest As WebRequest
        Dim WResponse As WebResponse
        wRequest = FtpWebRequest.Create(urls(i))
        WResponse = wRequest.GetResponse
        Dim SR As StreamReader
        SR = New StreamReader(WResponse.GetResponseStream)
        urls(i) = SR.ReadToEnd
 Next

Now the urls array holds the HTML source of every link, and I want to use HtmlAgilityPack to select nodes from each HTML source in the array:

Dim htmlDoc As New HtmlDocument()
htmlDoc.LoadHtml(urls) 
Dim wantednode = htmlDoc.DocumentNode.SelectNodes("Xpath")

But it didn't work.

I tried to put it in the same loop:

        Dim htmlDoc As New HtmlDocument()
        Dim wantednode As HtmlNodeCollection
For i As Int32 = 0 To urls.Length - 1
        Dim wRequest As WebRequest
        Dim WResponse As WebResponse
        wRequest = FtpWebRequest.Create(urls(i))
        WResponse = wRequest.GetResponse
        Dim SR As StreamReader
        SR = New StreamReader(WResponse.GetResponseStream)
        urls(i) = SR.ReadToEnd
        htmlDoc.Load(urls(i))
        wantednode = htmlDoc.DocumentNode.SelectNodes("Xpath")
next

This didn't work either. How can I loop wantednode = htmlDoc.DocumentNode.SelectNodes("Xpath") over each HTML source in the urls array?

Each HTML source in the urls array looks like this:

        <body>
          <div class="list_body">

            <ul class="listing">

                <li>
                    <a href="http://wanted1.com" title="">title1 </a>                                               
                     </li>
                <li>
                    <a href="http://wanted2.com" title="">title2  </a>                             
                     </li>
                <li>
                    <a href="http://wanted3.com" title="">title3  </a>                                                
                     </li>
                <li>
                    <a href="http://wanted4.com" title="">title4   </a>                                                                       
                     </li>
                <li>
                    <a href="http://wanted5.com" title="">title5  </a>                                                                                               
                     </li>
                <li>
                    <a href="http://wanted6.com" title="">title6   </a>                                                                                                
                     </li>
            </ul>

          </div>
       </body>

I want the http://wanted2.com link from every string in urls.

Here is some library code I use:

Public Function Web_Request_Response(URL As String) As String
    Try
        Dim myRequest As HttpWebRequest
        Dim myResponse As HttpWebResponse
        Dim sr As StreamReader
        Dim sResponse As String = ""
        myRequest = CType(WebRequest.Create(URL), HttpWebRequest)
        myResponse = CType(myRequest.GetResponse(), HttpWebResponse)
        sr = New StreamReader(myResponse.GetResponseStream())
        sResponse = sr.ReadToEnd.ToString
        Return sResponse
    Catch ex As Exception
        LogMsgBox(ex, ex.Message, , "WebRequest_Responce Error")
        Return ""
    End Try
End Function

It differs slightly from your calls. Perhaps something to do with the type casting? FTP vs HTTP?
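
On the FTP vs HTTP point: Create is a Shared method inherited from WebRequest, so FtpWebRequest.Create(url) resolves to WebRequest.Create(url), which chooses the concrete request type from the URI scheme. The following is only a minimal sketch of the same download written against the base types, with Using blocks so the response and reader are disposed. The module name DownloadSketch and the function name DownloadHtml are made up for illustration.

```vbnet
Imports System.IO
Imports System.Net

Module DownloadSketch
    ' Hypothetical helper: downloads one page and returns its HTML as a string.
    Public Function DownloadHtml(url As String) As String
        ' WebRequest.Create picks the request type from the URI scheme,
        ' so an http:// address yields an HttpWebRequest.
        Dim request As WebRequest = WebRequest.Create(url)

        ' Using disposes the response and the reader even if ReadToEnd throws.
        Using response As WebResponse = request.GetResponse()
            Using reader As New StreamReader(response.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function
End Module
```

This sketch does no error handling; a Try/Catch like the one in the function above could be wrapped around the call.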

Nice use of LINQ.

You can save some typing:

Dim alphabarray As String() = horof.Split(New Char() {" "c})
--- same as ---
Dim alphabarray As String() = horof.Split({" "c})
 or
Dim alphabarray As String() = horof.Split(" ")

You're supposed to use HtmlDocument.LoadHtml() instead of HtmlDocument.Load(), since you want to populate the HtmlDocument from an HTML string. Load() treats a string argument as a file path, not as markup:

'urls(i) value has been replaced with HTML string by the following line..
urls(i) = SR.ReadToEnd
'..so next, you need to use `LoadHtml()`
htmlDoc.LoadHtml(urls(i))

wantednode = htmlDoc.DocumentNode.SelectNodes("Xpath")
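
Putting this together for the sample markup in the question, a rough sketch could look like the one below. It makes several assumptions: the pages array is a stand-in for the HTML already downloaded into urls, the XPath //ul[@class='listing']/li[2]/a is inferred from the sample markup (second list item) and would need adjusting for the real pages, and the names SelectNodeLoopSketch, pages, and wantedLinks are made up for illustration. Because only one link per page is wanted, it uses SelectSingleNode. Note that both SelectSingleNode and SelectNodes return Nothing when nothing matches, so the result should be checked before use.

```vbnet
Imports System
Imports System.Collections.Generic
Imports HtmlAgilityPack

Module SelectNodeLoopSketch
    Sub Main()
        ' Stand-in for the HTML strings downloaded earlier (assumption:
        ' each page has the same structure as the sample in the question).
        Dim pages As String() = {
            "<body><div class=""list_body""><ul class=""listing"">" &
            "<li><a href=""http://wanted1.com"" title="""">title1</a></li>" &
            "<li><a href=""http://wanted2.com"" title="""">title2</a></li>" &
            "</ul></div></body>"
        }

        Dim wantedLinks As New List(Of String)()

        For Each html As String In pages
            ' A fresh document per page keeps the results independent.
            Dim htmlDoc As New HtmlDocument()
            htmlDoc.LoadHtml(html)

            ' Second <li> of the listing; returns Nothing if the page has no match.
            Dim anchor As HtmlNode =
                htmlDoc.DocumentNode.SelectSingleNode("//ul[@class='listing']/li[2]/a")

            If anchor IsNot Nothing Then
                ' The second argument is the default returned when href is missing.
                wantedLinks.Add(anchor.GetAttributeValue("href", ""))
            End If
        Next

        For Each link As String In wantedLinks
            Console.WriteLine(link)
        Next
    End Sub
End Module
```

Collecting the results in a list also avoids the problem in the question's loop, where wantednode is overwritten on every iteration and only the last page's result survives.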
