
VB.NET getElementById


I'm stumped. I don't want to use a WebBrowser in my application, and I want to get a specific element by id. My code is:

' Fetch the page and read the whole response body into a string.
Dim request As System.Net.HttpWebRequest = CType(System.Net.WebRequest.Create("http://www.google.com/finance?q=NASDAQ:GOOG"), System.Net.HttpWebRequest)
Using response As System.Net.HttpWebResponse = CType(request.GetResponse(), System.Net.HttpWebResponse)
    Using sr As New System.IO.StreamReader(response.GetResponseStream())
        Dim sourcecode As String = sr.ReadToEnd()
        TextBox1.Text = sourcecode
    End Using
End Using

This gets me the source code. But how do I get a specific element? I would think that there is an easy way to do this... By the way, I don't want to use Regex or download HTML Agility Pack.

You can make a parse table to recognize HTML tags, and search for id=elementname (allowing for whitespace around it) inside the tags. It's not the impossible task it may seem, because you can ignore most tags and you don't have to validate the HTML. Just consider what sits between < and >, and ignore the contents of quotes, scripts, etc. There are a lot more details and it takes a little work, but it's fun programming.
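The scanning idea described above can be sketched roughly as follows. This is a minimal illustration, not a full HTML parser: the module name `TagScanner` and the helpers `FindTagById`, `GetIdAttribute`, and `SkipSpaces` are hypothetical names chosen for this example. It assumes reasonably well-formed markup, compares ids case-sensitively, does not skip HTML comments, and returns only the opening tag.

```vbnet
Imports System

Module TagScanner
    ' Hypothetical helper: returns the first opening tag whose id attribute
    ' equals the given id (case-sensitive), or Nothing if there is none.
    Function FindTagById(html As String, id As String) As String
        If String.IsNullOrEmpty(html) OrElse String.IsNullOrEmpty(id) Then Return Nothing
        Dim i As Integer = 0
        While i < html.Length
            If html(i) <> "<"c Then
                i += 1
                Continue While
            End If
            ' Find the ">" that ends this tag, ignoring any ">" inside quotes.
            Dim j As Integer = i + 1
            Dim inQuote As Boolean = False
            Dim quoteChar As Char = """"c
            While j < html.Length
                Dim c As Char = html(j)
                If inQuote Then
                    If c = quoteChar Then inQuote = False
                ElseIf c = """"c OrElse c = "'"c Then
                    inQuote = True
                    quoteChar = c
                ElseIf c = ">"c Then
                    Exit While
                End If
                j += 1
            End While
            If j >= html.Length Then Return Nothing ' unterminated tag
            Dim tag As String = html.Substring(i, j - i + 1)
            If String.Equals(GetIdAttribute(tag), id, StringComparison.Ordinal) Then Return tag
            ' Skip script bodies, which may contain "<" and ">" as operators.
            If tag.StartsWith("<script", StringComparison.OrdinalIgnoreCase) Then
                Dim endScript As Integer = html.IndexOf("</script", j, StringComparison.OrdinalIgnoreCase)
                If endScript < 0 Then Return Nothing
                i = endScript
            Else
                i = j + 1
            End If
        End While
        Return Nothing
    End Function

    ' Walks the attributes of one tag and returns the id value, or Nothing.
    Private Function GetIdAttribute(tag As String) As String
        Dim k As Integer = 1
        ' Skip the tag name.
        While k < tag.Length AndAlso Not Char.IsWhiteSpace(tag(k)) AndAlso tag(k) <> ">"c
            k += 1
        End While
        While k < tag.Length
            ' Skip whitespace and "/" before the next attribute name.
            While k < tag.Length AndAlso (Char.IsWhiteSpace(tag(k)) OrElse tag(k) = "/"c)
                k += 1
            End While
            If k >= tag.Length OrElse tag(k) = ">"c Then Exit While
            Dim nameStart As Integer = k
            While k < tag.Length AndAlso Not Char.IsWhiteSpace(tag(k)) AndAlso tag(k) <> "="c AndAlso tag(k) <> ">"c
                k += 1
            End While
            Dim name As String = tag.Substring(nameStart, k - nameStart)
            Dim value As String = ""
            k = SkipSpaces(tag, k)
            If k < tag.Length AndAlso tag(k) = "="c Then
                k = SkipSpaces(tag, k + 1)
                If k < tag.Length AndAlso (tag(k) = """"c OrElse tag(k) = "'"c) Then
                    ' Quoted value: runs to the matching quote.
                    Dim closing As Integer = tag.IndexOf(tag(k), k + 1)
                    If closing < 0 Then Return Nothing
                    value = tag.Substring(k + 1, closing - k - 1)
                    k = closing + 1
                Else
                    ' Unquoted value: runs to the next whitespace or ">".
                    Dim valueStart As Integer = k
                    While k < tag.Length AndAlso Not Char.IsWhiteSpace(tag(k)) AndAlso tag(k) <> ">"c
                        k += 1
                    End While
                    value = tag.Substring(valueStart, k - valueStart)
                End If
            End If
            If String.Equals(name, "id", StringComparison.OrdinalIgnoreCase) Then Return value
        End While
        Return Nothing
    End Function

    ' Returns the index of the first non-whitespace character at or after k.
    Private Function SkipSpaces(s As String, k As Integer) As Integer
        While k < s.Length AndAlso Char.IsWhiteSpace(s(k))
            k += 1
        End While
        Return k
    End Function
End Module
```

With the page text in a string, a call such as `TagScanner.FindTagById(sourcecode, "someId")` would return the matching opening tag, whether the id is written as id="someId" or id=someId. Because attributes are read one at a time, an id is only matched as a whole attribute value, not as a prefix of a longer one.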

The alternative would be to download something like HTML Agility Pack, use a browser, or use a regex, all of which you'd like to avoid.

Here's a very rough idea. It does not work for block elements that need a separate closing tag, but it works fine for self-closing elements.

I also noticed that some of the tag ids are enclosed in quotation marks and some are not, so you may have to tweak that...

I just roughed this code up and copy-pasted the routine to detect unquoted id attributes, but it still needs work and could be shortened too.

<script runat="server">
Dim sourcecode As String
Dim bodycode As String
Dim RetVal As String

Protected Sub Page_Load(sender As Object, e As System.EventArgs)
    ' Fetch the page on every request, then look up the id if the form was submitted.
    LoadHttpStuff()
    If Request.Form("Button1") = "Submit" Then
        RetVal = MyGetElementById(Request("Text1"))
    End If

End Sub

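Private Sub LoadHttpStuff()
    Dim request As System.Net.HttpWebRequest
    Dim startat As Integer
    Dim finishat As Integer

    request = CType(System.Net.WebRequest.Create("http://www.google.com/finance?q=NASDAQ:GOOG"), System.Net.HttpWebRequest)
    Using response As System.Net.HttpWebResponse = CType(request.GetResponse(), System.Net.HttpWebResponse)
        Using sr As New System.IO.StreamReader(response.GetResponseStream())
            sourcecode = sr.ReadToEnd()
        End Using
    End Using

    ' Match "<body" rather than "<body>" so a body tag with attributes is found too,
    ' and compare case-insensitively so "<BODY>" also matches.
    startat = InStr(1, sourcecode, "<body", CompareMethod.Text)
    finishat = InStr(1, sourcecode, "</body>", CompareMethod.Text)
    If startat > 0 AndAlso finishat > startat Then
        bodycode = Mid(sourcecode, startat, finishat + 7 - startat)
    Else
        ' No recognizable body element: fall back to searching the whole document.
        bodycode = sourcecode
    End If
    ' Lower-case the text so the id search in MyGetElementById is case-insensitive.
    bodycode = LCase(bodycode)
End Sub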

Private Function MyGetElementById(Id As String) As String
    ' Returns the opening tag that carries the given id, or "" if none is found.
    ' Only the tag itself is returned, not the element's content.
    Dim needle As String
    Dim posx As Integer
    Dim tagstart As Integer
    Dim tagend As Integer

    If String.IsNullOrEmpty(bodycode) OrElse Trim(Id) = "" Then Return ""

    ' Try the quoted form first (id="foo"), then the unquoted form (id=foo).
    ' Note: the unquoted search also matches longer ids that start with Id.
    needle = LCase("id=" & Chr(34) & Trim(Id) & Chr(34))
    posx = InStr(bodycode, needle)
    If posx = 0 Then
        needle = LCase("id=" & Trim(Id))
        posx = InStr(bodycode, needle)
    End If
    If posx = 0 Then Return ""

    ' Search backwards for the "<" that opens the tag ...
    tagstart = InStrRev(bodycode, "<", posx)
    If tagstart = 0 Then Return ""

    ' ... and forwards for the first ">" that closes it.
    tagend = InStr(posx, bodycode, ">")
    If tagend = 0 Then Return ""

    Return Mid(bodycode, tagstart, tagend - tagstart + 1)
End Function
</script>

<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
  <title></title>
</head>
<body>
  <form id="form1" runat="server">
    <table style="width: 100%;">
        <tr>
            <td style="text-align:left; vertical-align: top; width: 75%;"><textarea rows="20" cols="80" style="width: 90%;" disabled="disabled"><%= Server.HtmlEncode(sourcecode) %></textarea></td>
            <td style="width: 25%; text-align: left; vertical-align: top;">
                <table style="width:100%;">
                    <tr>
                        <td>Element Id&nbsp;<input id="Text1" name="Text1" type="text" /></td>
                    </tr><tr>
                        <td>&nbsp;</td>
                    </tr><tr>
                        <td>&nbsp;</td>
                    </tr><tr>
                        <td><input id="Button1" type="Submit" value="Submit" name="Button1" /></td>
                    </tr><tr>
                        <td>&nbsp;</td>
                    </tr><tr>
                        <td>&nbsp;</td>
                    </tr>
                </table>
            </td>
        </tr><tr>
            <td style="width: 75%;">&nbsp;</td>
            <td style="width: 25%;">&nbsp;</td>
        </tr><tr>
            <td style="width: 100%;" colspan="2"><textarea rows="20" cols="80" style="width: 75%;" disabled="disabled"><%= Server.HtmlEncode(RetVal) %></textarea></td>
            <td style="width: 25%;">&nbsp;</td>
        </tr>
    </table>
</form>
</body>
</html>
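The listing above returns only the opening tag. For elements that need a separate closing tag, one possible extension is to count nested opening and closing tags of the same name until the matching close is reached. The sketch below is only an illustration under assumptions: `InnerHtmlSketch`, `GetInnerHtml`, and `FindTag` are hypothetical names, the opening tag is assumed to have been found already (for example by the routine above), and tag-like text inside scripts, comments, or attribute values is not handled.

```vbnet
Imports System

Module InnerHtmlSketch
    ' Hypothetical helper: given the page text and an opening tag found in it,
    ' returns the markup between that tag and its matching closing tag,
    ' or Nothing if the tag or its closing tag cannot be found.
    Function GetInnerHtml(html As String, openTag As String) As String
        If String.IsNullOrEmpty(html) OrElse String.IsNullOrEmpty(openTag) Then Return Nothing
        Dim start As Integer = html.IndexOf(openTag, StringComparison.OrdinalIgnoreCase)
        If start < 0 Then Return Nothing
        Dim contentStart As Integer = start + openTag.Length

        ' The tag name is the run of letters and digits right after "<".
        Dim nameEnd As Integer = 1
        While nameEnd < openTag.Length AndAlso Char.IsLetterOrDigit(openTag(nameEnd))
            nameEnd += 1
        End While
        Dim name As String = openTag.Substring(1, nameEnd - 1)
        If name.Length = 0 Then Return Nothing

        ' Walk forward, counting nested elements of the same name.
        Dim depth As Integer = 1
        Dim pos As Integer = contentStart
        While depth > 0
            Dim nextOpen As Integer = FindTag(html, "<" & name, pos)
            Dim nextClose As Integer = FindTag(html, "</" & name, pos)
            If nextClose < 0 Then Return Nothing ' no matching closing tag
            If nextOpen >= 0 AndAlso nextOpen < nextClose Then
                depth += 1
                pos = nextOpen + 1
            Else
                depth -= 1
                If depth = 0 Then Return html.Substring(contentStart, nextClose - contentStart)
                pos = nextClose + 1
            End If
        End While
        Return Nothing
    End Function

    ' Finds prefix at or after startIndex, requiring that the next character
    ' ends the tag name (so "<div" does not match "<divider").
    Private Function FindTag(html As String, prefix As String, startIndex As Integer) As Integer
        Dim p As Integer = html.IndexOf(prefix, startIndex, StringComparison.OrdinalIgnoreCase)
        While p >= 0
            Dim after As Integer = p + prefix.Length
            If after >= html.Length OrElse Not Char.IsLetterOrDigit(html(after)) Then Return p
            p = html.IndexOf(prefix, p + 1, StringComparison.OrdinalIgnoreCase)
        End While
        Return -1
    End Function
End Module
```

Since the page code lower-cases bodycode before searching, a call such as `InnerHtmlSketch.GetInnerHtml(bodycode, RetVal)` would return lower-cased content; searching the original sourcecode instead would preserve case, because the lookup ignores case.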

Hope it helps a little.


 