VB.NET getElementById
I'm stumped. I don't want to use a WebBrowser in my application, and I want to get a specific element by id.
My code is:
Dim request As System.Net.HttpWebRequest = DirectCast(System.Net.WebRequest.Create("http://www.google.com/finance?q=NASDAQ:GOOG"), System.Net.HttpWebRequest)
Dim response As System.Net.HttpWebResponse = DirectCast(request.GetResponse(), System.Net.HttpWebResponse)
Dim sr As System.IO.StreamReader = New System.IO.StreamReader(response.GetResponseStream())
Dim sourcecode As String = sr.ReadToEnd()
TextBox1.Text = sourcecode
This gets me the source code. But how do I get a specific element?
I would think that there is an easy way to do this... By the way, I don't want to use Regex or download HTML Agility Pack.
You can build a parse table that recognizes HTML tags, and search for id=elementname (plus possible whitespace characters) inside the tags. It's not the impossible task it may seem, because you can ignore most tags and you don't have to validate the HTML. Just consider < and >, and ignore the contents of quotes, scripts, etc. There are many more details and it takes a little work, but it's fun programming.
The alternative would be to download something like HTML Agility Pack, use a browser, or use a regex, all of which you'd like to avoid.
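As a rough illustration of the scanning idea, here is a minimal sketch that only looks at < and >, skips over quoted attribute values, and checks each tag for an id attribute. The names TagScanner, FindTagById, TagHasId and ReadValue are hypothetical, and the sketch assumes the input has no comments or script blocks containing stray < characters; it also does not guard against the text "id=" appearing inside another attribute's quoted value.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module TagScanner
    ' Returns the opening tag whose id attribute equals wantedId, or "" if none is found.
    ' Hypothetical helper: comments and <script> contents are NOT handled.
    Function FindTagById(html As String, wantedId As String) As String
        Dim pos As Integer = 0
        While pos < html.Length
            Dim tagOpen As Integer = html.IndexOf("<"c, pos)
            If tagOpen < 0 Then Exit While
            ' Walk to the closing ">" while ignoring any ">" inside quotes.
            Dim i As Integer = tagOpen + 1
            Dim quote As Char = ControlChars.NullChar
            While i < html.Length
                Dim c As Char = html(i)
                If quote <> ControlChars.NullChar Then
                    If c = quote Then quote = ControlChars.NullChar
                ElseIf c = """"c OrElse c = "'"c Then
                    quote = c
                ElseIf c = ">"c Then
                    Exit While
                End If
                i += 1
            End While
            If i >= html.Length Then Exit While ' unterminated tag
            Dim tag As String = html.Substring(tagOpen, i - tagOpen + 1)
            If TagHasId(tag, wantedId) Then Return tag
            pos = i + 1
        End While
        Return ""
    End Function

    ' True when the tag text contains an attribute named id with the wanted value.
    Private Function TagHasId(tag As String, wantedId As String) As Boolean
        Dim p As Integer = tag.IndexOf("id", StringComparison.OrdinalIgnoreCase)
        While p >= 0
            ' "id" must start an attribute name, so it has to follow whitespace.
            If p > 0 AndAlso Char.IsWhiteSpace(tag(p - 1)) Then
                Dim j As Integer = SkipSpaces(tag, p + 2)
                If j < tag.Length AndAlso tag(j) = "="c Then
                    Dim value As String = ReadValue(tag, SkipSpaces(tag, j + 1))
                    If String.Equals(value, wantedId, StringComparison.Ordinal) Then Return True
                End If
            End If
            p = tag.IndexOf("id", p + 2, StringComparison.OrdinalIgnoreCase)
        End While
        Return False
    End Function

    ' Advances past any whitespace starting at index start.
    Private Function SkipSpaces(tag As String, start As Integer) As Integer
        While start < tag.Length AndAlso Char.IsWhiteSpace(tag(start))
            start += 1
        End While
        Return start
    End Function

    ' Reads a quoted or unquoted attribute value starting at index start.
    Private Function ReadValue(tag As String, start As Integer) As String
        If start >= tag.Length Then Return ""
        Dim q As Char = tag(start)
        If q = """"c OrElse q = "'"c Then
            Dim endQuote As Integer = tag.IndexOf(q, start + 1)
            If endQuote < 0 Then Return ""
            Return tag.Substring(start + 1, endQuote - start - 1)
        End If
        Dim last As Integer = start
        While last < tag.Length AndAlso Not Char.IsWhiteSpace(tag(last)) AndAlso tag(last) <> ">"c AndAlso tag(last) <> "/"c
            last += 1
        End While
        Return tag.Substring(start, last - start)
    End Function
End Module
```

With the sourcecode string from the question, a call such as TagScanner.FindTagById(sourcecode, "someId") would return only the opening tag. Getting the element's inner content would still need the extra work mentioned above.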
Here's a very rough idea. It does not work for block elements that need a separate closing tag, but it works fine for self-closing elements.
I also noticed that some tag ids are enclosed in quotation marks and some are not, so you may have to tweak that.
I just roughed this code out and copy-pasted the routine to detect unquoted ids; it still needs work and could be shortened too.
<script runat="server">
Dim sourcecode As String
Dim bodycode As String
Dim RetVal As String
Protected Sub Page_Load(sender As Object, e As System.EventArgs)
'
LoadHttpStuff()
If Request.Form("Button1") = "Submit" Then
RetVal = MyGetElementById(Request("Text1"))
End If
End Sub
Private Sub LoadHttpStuff()
Dim request As System.Net.HttpWebRequest
Dim response As System.Net.HttpWebResponse
Dim sr As System.IO.StreamReader
Dim finishat As Long
Dim startat As Long
request = System.Net.HttpWebRequest.Create("http://www.google.com/finance?q=NASDAQ:GOOG")
response = request.GetResponse()
sr = New System.IO.StreamReader(response.GetResponseStream())
sourcecode = sr.ReadToEnd()
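' Match <body ...> even with attributes or upper case; fall back to the whole page.
startat = InStr(sourcecode, "<body", CompareMethod.Text)
finishat = InStr(sourcecode, "</body>", CompareMethod.Text) + 7
If startat > 0 AndAlso finishat > startat Then
bodycode = Mid(sourcecode, startat, finishat - startat)
Else
bodycode = sourcecode
End If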
bodycode = LCase(bodycode)
End Sub
Private Function MyGetElementById(Id As String) As String
Dim tagstart As Long
Dim tagend As Long
Dim posx As Long
Dim item As System.Web.UI.HtmlControls.HtmlGenericControl
Dim test As Boolean
Dim letter As Char
Dim text As String
item = Nothing
test = False
text = ""
If Trim(Id) <> "" Then
'-> with SPEECHMARKS
posx = InStr(bodycode, LCase("id=" & Chr(34) & Id & Chr(34)))
If posx > 0 Then
'find start of tag
Do
posx = posx - 1
letter = Mid(bodycode, posx, 1)
If letter = "<" Then
'found tag start
tagstart = posx
Exit Do
End If
Loop Until posx <= 1
If tagstart > 0 Then
posx = InStr(bodycode, LCase("id=" & Chr(34) & Id & Chr(34)))
Do
posx = posx + 1
letter = Mid(bodycode, posx, 1)
If letter = ">" Then
tagend = posx + 1
Exit Do
End If
Loop Until posx >= Len(bodycode)
If tagend > 0 Then
text = Mid(bodycode, tagstart, tagend - tagstart)
test = True
End If
End If
Else
posx = InStr(bodycode, LCase("id=" & Id))
If posx > 0 Then
'find start of tag
Do
posx = posx - 1
letter = Mid(bodycode, posx, 1)
If letter = "<" Then
'found tag start
tagstart = posx
Exit Do
End If
Loop Until posx <= 1
If tagstart > 0 Then
posx = InStr(bodycode, LCase("id=" & Id))
Do
posx = posx + 1
letter = Mid(bodycode, posx, 1)
If letter = ">" Then
tagend = posx + 1
'stop at the first ">" after the id, as in the quoted branch
Exit Do
End If
Loop Until posx >= Len(bodycode)
If tagend > 0 Then
text = Mid(bodycode, tagstart, tagend - tagstart)
test = True
End If
End If
End If
End If
End If
Return Text
End Function
</script>
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
<title></title>
</head>
<body>
<form id="form1" runat="server">
<table style="width: 100%;">
<tr>
<td style="text-align:left; vertical-align: top; width: 75%;"><textarea rows="20" cols="80" style="width: 90%;" disabled="disabled"><%=sourcecode%></textarea></td>
<td style="width: 25%; text-align: left; vertical-align: top;">
<table style="width:100%;">
<tr>
<td>Element Id <input id="Text1" name="Text1" type="text" /></td>
</tr><tr>
<td> </td>
</tr><tr>
<td> </td>
</tr><tr>
<td><input id="Button1" type="Submit" value="Submit" name="Button1" /></td>
</tr><tr>
<td> </td>
</tr><tr>
<td> </td>
</tr>
</table>
</td>
</tr><tr>
<td style="width: 75%;"> </td>
<td style="width: 25%;"> </td>
</tr><tr>
<td style="width: 100%;" colspan="2"><textarea rows="20" cols="80" style="width: 75%;" disabled="disabled"><%=RetVal%></textarea></td>
<td style="width: 25%;"> </td>
</tr>
</table>
</form>
</body>
</html>
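Since the two branches of MyGetElementById are nearly identical, the routine can probably be shortened along these lines. This is only a sketch: the names ShortGetElementById and GetTagById are made up, and it keeps the limitations of the original (it returns only the opening tag, and the unquoted form id=foo would also match id=foobar).

```vbnet
Imports System

Module ShortGetElementById
    ' Hypothetical compact variant: returns the opening tag that contains
    ' id="wantedId" or id=wantedId, or "" when nothing matches.
    Function GetTagById(body As String, wantedId As String) As String
        If String.IsNullOrWhiteSpace(wantedId) Then Return ""
        ' Try the quoted form first, then the unquoted one.
        For Each needle As String In {"id=""" & wantedId & """", "id=" & wantedId}
            Dim hit As Integer = body.IndexOf(needle, StringComparison.OrdinalIgnoreCase)
            If hit < 0 Then Continue For
            ' Nearest "<" before the match and first ">" after it delimit the tag.
            Dim tagStart As Integer = body.LastIndexOf("<"c, hit)
            Dim tagEnd As Integer = body.IndexOf(">"c, hit + needle.Length)
            If tagStart >= 0 AndAlso tagEnd >= 0 Then
                Return body.Substring(tagStart, tagEnd - tagStart + 1)
            End If
        Next
        Return ""
    End Function
End Module
```

Because the search is case-insensitive, the body does not need to be lower-cased first, so the returned tag keeps its original casing.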
Hope it helps a little.