
Extracting value in a webpage Table Row to a string using VB.NET

I should preface this by letting you know I'm not a coder. I'm just a person who knows a little VB.NET and likes efficiency.

I'm using WebBrowser1.Document.Body.InnerHtml to get the source of the page loaded in the WebBrowser control. The source contains this table row:

<tr>
    <td width="60px">
        11/04/18    </td>
    <td width="80px">
        John Smith  </td>
    <td>
        CHARGED_ONBOARDING_FEE - Admin manual charged   </td>
</tr>

I can easily check if CHARGED_ONBOARDING_FEE appears in the page with this:

i = WebBrowser1.Document.Body.InnerHtml

If i.Contains("CHARGED_ONBOARDING_FEE") Then
    RichTextBox1.AppendText("OB PAID" & vbNewLine)
Else
    RichTextBox1.AppendText("NO FEE" & vbNewLine)
End If

However, is there any way I can extract that date (11/04/18)?

Is it possible to have this workflow or something similar?

 1. if exists CHARGED_ONBOARDING_FEE proceed

 2. Check backward in string for <td width="60px"> if exists proceed

 3. date1 = string between "60px"> and </td>

 4. RichTextBox1.AppendText("OB PAID" & " on " & date1 & vbNewLine)

Thanks for any help guys, please go easy on me!

I suggest using an external library: HTML Agility Pack.

You can find plenty of examples on the project's Examples page.

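' Fully qualified, because a WinForms project also has System.Windows.Forms.HtmlDocument
Dim htmlFile As New HtmlAgilityPack.HtmlDocument()
htmlFile.LoadHtml("YourHtmlCode")
Dim htmlNodes = htmlFile.DocumentNode.SelectNodes("//tr/td")

' SelectNodes returns Nothing when no node matches the XPath
If htmlNodes IsNot Nothing Then
    For Each node In htmlNodes
        MsgBox(node.InnerHtml)
    Next
End If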
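
Applied to the row from the question, the same library could pick out the date cell directly. The following is only a minimal sketch: the module name AgilityPackSketch and the function FindFeeDate are hypothetical names, and it assumes the HtmlAgilityPack NuGet package is referenced and that the date is always the first <td> of the matching row.

```vbnet
Imports HtmlAgilityPack

Module AgilityPackSketch
    ' Hypothetical helper: returns the text of the first cell of the first
    ' row whose text contains the marker, or Nothing when no row matches.
    Function FindFeeDate(html As String, marker As String) As String
        ' Fully qualified to avoid clashing with System.Windows.Forms.HtmlDocument
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(html)

        Dim rows = doc.DocumentNode.SelectNodes("//tr")
        If rows Is Nothing Then Return Nothing   ' the page has no <tr> at all

        For Each row As HtmlNode In rows
            If row.InnerText.Contains(marker) Then
                ' "td" is evaluated relative to the row, so this is its first cell
                Dim firstCell = row.SelectSingleNode("td")
                If firstCell IsNot Nothing Then Return firstCell.InnerText.Trim()
            End If
        Next

        Return Nothing
    End Function
End Module
```

It could then be called with FindFeeDate(WebBrowser1.Document.Body.InnerHtml, "CHARGED_ONBOARDING_FEE"), treating a Nothing result as the "NO FEE" case.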

Based on Jimi's idea:

Dim date1 As String
Dim textExistOrNot As Boolean = False

' Loop over every <tr> element in the web page
For Each trSect As HtmlElement In WebBrowser1.Document.GetElementsByTagName("tr")

    ' InnerText can be Nothing for an empty row, so check it before calling Contains.
    ' InnerText includes the text of the row's children.
    If trSect.InnerText IsNot Nothing AndAlso trSect.InnerText.Contains("CHARGED_ONBOARDING_FEE") Then

        ' Children(0) is <td width="60px"> 11/04/18 </td>
        ' Children(1) is <td width="80px"> John Smith </td>
        ' Children(2) is <td> CHARGED_ONBOARDING_FEE - Admin manual charged </td>
        date1 = trSect.Children(0).InnerText
        RichTextBox1.AppendText("OB PAID" & " on " & date1 & vbNewLine)
        textExistOrNot = True

    End If

Next

' No row contained the marker text
If Not textExistOrNot Then
    RichTextBox1.AppendText("NO FEE" & vbNewLine)
End If

Hope this code solves your problem.
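
For completeness, the four-step string workflow from the question can also be sketched with plain String methods. This is a rough sketch rather than a recommended approach: the module name FeeDateSketch and the function ExtractFeeDate are hypothetical, and it assumes the markup in the string matches the question's sample exactly. The WebBrowser control may return InnerHtml with different tag casing or attribute quoting than the original page source, in which case the literal search for <td width="60px"> would fail, so the DOM-based loop above is likely more robust.

```vbnet
Imports System

Module FeeDateSketch
    ' Hypothetical helper: returns the date text, or Nothing when the marker
    ' or the expected <td> markup is not found.
    Function ExtractFeeDate(html As String) As String
        Const marker As String = "CHARGED_ONBOARDING_FEE"
        Const openTag As String = "<td width=""60px"">"
        Const closeTag As String = "</td>"

        ' Step 1: the marker must exist
        Dim markerPos = html.IndexOf(marker, StringComparison.Ordinal)
        If markerPos < 0 Then Return Nothing

        ' Step 2: search backward from the marker for the opening tag
        Dim openPos = html.LastIndexOf(openTag, markerPos, StringComparison.Ordinal)
        If openPos < 0 Then Return Nothing

        ' Step 3: take the text between the opening tag and the next </td>
        Dim startPos = openPos + openTag.Length
        Dim closePos = html.IndexOf(closeTag, startPos, StringComparison.Ordinal)
        If closePos < 0 Then Return Nothing

        Return html.Substring(startPos, closePos - startPos).Trim()
    End Function
End Module
```

Step 4 would then be the caller's job: append "OB PAID on " plus the result when it is not Nothing, and "NO FEE" otherwise.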

