Extract a specific HTML string from a website's HTML source in VB.NET

I have the full HTML source of a website and want to extract the data inside a specific div tag. Here is my code:

Imports System.IO
Imports System.Net
Imports System.Text.RegularExpressions

' Download the page source into a string.
Dim html As String = String.Empty
Dim request As WebRequest = WebRequest.Create("https://www.crowdsurge.com/store/index.php?storeid=1056&menu=detail&eventid=41815")
Using response As WebResponse = request.GetResponse()
    Using reader As New StreamReader(response.GetResponseStream())
        html = reader.ReadToEnd()
    End Using
End Using

' Try to capture the text that follows the opening div tag.
Dim pattern1 As String = "<div class = ""ei_value ei_date"">(.*)"
Dim m As Match = Regex.Match(html, pattern1)
If m.Success Then
    MsgBox(m.Groups(1).Value)
End If

An easier approach to parsing HTML (especially from a source that you don't control) is to use the HTML Agility Pack, which lets you do something like this:

Imports System.Net
Imports HtmlAgilityPack

Dim req As WebRequest = WebRequest.Create("https://www.crowdsurge.com/store/index.php?storeid=1056&menu=detail&eventid=41815")
Dim doc As New HtmlDocument()
Using res As WebResponse = req.GetResponse()
    ' Parse the response stream directly into a DOM.
    doc.Load(res.GetResponseStream())
End Using

' SelectNodes returns Nothing when no node matches the XPath.
Dim nodes = doc.DocumentNode.SelectNodes("//div[@class='ei_value ei_date']")
If nodes IsNot Nothing Then
    For Each node In nodes
        MsgBox(node.InnerText)
    Next
End If

(I've assumed Option Infer is On.)

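Try this, which allows optional whitespace around the equals sign and stops lazily at the first closing div tag: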

Dim pattern1 As String = "<div class\s*=\s*""ei_value ei_date"">(.*?)</div>"

or, if the attribute is written without spaces around the equals sign:

Dim pattern1 As String = "<div class=""ei_value ei_date"">(.*?)</div>"
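
As a rough illustration of how the lazy pattern behaves, here is a minimal console sketch. The sample markup, the module name RegexDivDemo, and the placeholder text SAMPLE DATE are all made up for the demonstration and are not taken from the real page. It also assumes the div content may span several lines, which is why RegexOptions.Singleline is passed: without it, `.` does not match a newline and the match would fail on multi-line content.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module RegexDivDemo
    Sub Main()
        ' Hypothetical sample markup standing in for the downloaded page.
        Dim html As String =
            "<div class=""ei_label"">Date</div>" & vbLf &
            "<div class = ""ei_value ei_date"">" & vbLf &
            "  SAMPLE DATE" & vbLf &
            "</div>" & vbLf &
            "<div class=""ei_value ei_venue"">SAMPLE VENUE</div>"

        ' \s* tolerates optional spaces around "="; (.*?) is lazy, so the
        ' capture ends at the first </div> instead of the last one.
        Dim pattern As String = "<div\s+class\s*=\s*""ei_value ei_date""\s*>(.*?)</div>"

        ' Singleline lets "." match newlines inside the div.
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.Singleline)
        If m.Success Then
            ' Trim removes the whitespace and newlines around the captured text.
            Console.WriteLine(m.Groups(1).Value.Trim())
        End If
    End Sub
End Module
```

Note that a regex like this still stops at the first `</div>`, so it will cut the result short if the target div contains nested div elements; an HTML parser handles that case more reliably.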
