[英]Need help getting table from HTML
我已經成功地使用以下代碼從 Marketwatch.com 提取共同基金業績數據:
Dim A As Long
Dim B As Long
Dim C As Long
Dim Z As Long
For Z = 1 To 35
Range("A1").Select
ActiveCell.Offset((37 + (Z * 10)), 0).Select
If ActiveCell.Value = "" Then
Exit For
Else
End If
Dim oHTML As Object
Dim oTable As Object
Dim x As Long
Dim Y As Long
Dim vData As Variant
Set oHTML = CreateObject("HTMLFile")
With CreateObject("WinHTTP.WinHTTPRequest.5.1")
.Open "GET", "http://www.marketwatch.com/investing/fund/vfinx", False
.send
oHTML.body.innerhtml = .responsetext
End With
For Each oTable In oHTML.Getelementsbytagname("table")
If oTable.classname = "fundstable" Then
ReDim vData(1 To oTable.Rows.Length, 1 To oTable.Rows(1).Cells.Length)
For x = 1 To UBound(vData)
For Y = 1 To UBound(vData, 2)
vData(x, Y) = oTable.Rows(x - 1).Cells(Y - 1).innertext
Next Y
Next x
With ActiveCell.Offset(1, 0)
.Resize(UBound(vData), UBound(vData, 2)).Value = vData
End With
Exit For
End If
Next oTable
Next Z
不幸的是,Marketwatch 添加了一個驗證碼來阻止機器人(即我)抓取他們的數據。 我不知道無論如何,所以我想我會嘗試另一個網站。
我看了看晨星: http : //performance.morningstar.com/fund/performance-return.action? t= VFINX& region= usa& culture= en_US
看起來我想要在該頁面上的表格是:“table.r_table3 width955px print97”或只是“r_table3 width955px print97”,但似乎沒有一個對我有用。
有任何想法嗎?
謝謝!
數據由 javascript 加載,無法通過 XMLHTTP 請求獲得,因為腳本不會運行來加載內容。
例如,您可以在 IE 中使用第二個鏈接並引入等待以確保加載信息。 我展示了在索引 1 處獲取具有該類名的表。您可以在此處更改索引:
ele.item(1).outerHTML
因此,對於下一個表,請使用clipboard.SetText ele.item(2).outerHTML
。
您還可以循環的.Length
的ele
得到各表,但保證你寫了不同的小區,當你粘貼:
Dim i As Long
For i = 0 To ele.Length-1
clipboard.SetText ele.item(i).outerHTML
'Etc
Next
VBA:
Option Explicit
Public Sub GetInfo()
Dim IE As New InternetExplorer, clipboard As Object
Dim ele As Object, ws As Worksheet, t As Date, tableCount As Long
Const MAX_WAIT_SEC As Long = 5
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
With IE
.Visible = True
.navigate "http://performance.morningstar.com/fund/performance-return.action?t=VFINX®ion=usa&culture=en_US"
While .Busy Or .readyState < 4: DoEvents: Wend
With .document
t = Timer
Do
DoEvents
On Error Resume Next
Set ele = .querySelectorAll(".r_table3.print97")
tableCount = ele.Length
On Error GoTo 0
If Timer - t > MAX_WAIT_SEC Then Exit Do
Loop While tableCount < 3
If Not ele Is Nothing Then
clipboard.SetText ele.item(1).outerHTML
clipboard.PutInClipboard
ws.Cells(1, 1).PasteSpecial
End If
End With
.Quit
End With
End Sub
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.