繁体   English   中英

需要帮助从 HTML 获取表格

[英]Need help getting table from HTML

我已经成功地使用以下代码从 Marketwatch.com 提取共同基金业绩数据:

Dim A As Long
Dim B As Long
Dim C As Long
Dim Z As Long


For Z = 1 To 35

    Range("A1").Select
    ActiveCell.Offset((37 + (Z * 10)), 0).Select
    If ActiveCell.Value = "" Then
    Exit For
    Else
    End If
Dim oHTML       As Object
Dim oTable      As Object
Dim x           As Long
Dim Y           As Long
Dim vData       As Variant

Set oHTML = CreateObject("HTMLFile")

With CreateObject("WinHTTP.WinHTTPRequest.5.1")
.Open "GET", "http://www.marketwatch.com/investing/fund/vfinx", False
.send
oHTML.body.innerhtml = .responsetext
End With

For Each oTable In oHTML.Getelementsbytagname("table")
If oTable.classname = "fundstable" Then
    ReDim vData(1 To oTable.Rows.Length, 1 To oTable.Rows(1).Cells.Length)

        For x = 1 To UBound(vData)
        For Y = 1 To UBound(vData, 2)
            vData(x, Y) = oTable.Rows(x - 1).Cells(Y - 1).innertext
        Next Y
        Next x

    With ActiveCell.Offset(1, 0)
    .Resize(UBound(vData), UBound(vData, 2)).Value = vData
    End With
Exit For
End If
Next oTable

Next Z

不幸的是,Marketwatch 添加了一个验证码来阻止机器人(即我)抓取他们的数据。 我不知道无论如何,所以我想我会尝试另一个网站。

我看了看晨星: http : //performance.morningstar.com/fund/performance-return.action? t= VFINX& region= usa& culture= en_US

看起来我想要在该页面上的表格是:“table.r_table3 width955px print97”或只是“r_table3 width955px print97”,但似乎没有一个对我有用。

有任何想法吗?

谢谢!

数据由 javascript 加载,无法通过 XMLHTTP 请求获得,因为脚本不会运行来加载内容。

例如,您可以在 IE 中使用第二个链接并引入等待以确保加载信息。 我展示了在索引 1 处获取具有该类名的表。您可以在此处更改索引:

ele.item(1).outerHTML

因此,对于下一个表,请使用clipboard.SetText ele.item(2).outerHTML

您还可以循环的.Lengthele得到各表,但保证你写了不同的小区,当你粘贴:

Dim i As Long
For i = 0 To ele.Length-1
    clipboard.SetText  ele.item(i).outerHTML
   'Etc   
Next

VBA:

Option Explicit

Public Sub GetInfo()
    Dim IE As New InternetExplorer, clipboard As Object
    Dim ele As Object, ws As Worksheet, t As Date, tableCount As Long
    Const MAX_WAIT_SEC As Long = 5
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")

    With IE
        .Visible = True
        .navigate "http://performance.morningstar.com/fund/performance-return.action?t=VFINX&region=usa&culture=en_US"

        While .Busy Or .readyState < 4: DoEvents: Wend

        With .document
            t = Timer
            Do
                DoEvents
                On Error Resume Next
                Set ele = .querySelectorAll(".r_table3.print97")
                tableCount = ele.Length
                On Error GoTo 0
                If Timer - t > MAX_WAIT_SEC Then Exit Do
            Loop While tableCount < 3

            If Not ele Is Nothing Then
                clipboard.SetText ele.item(1).outerHTML
                clipboard.PutInClipboard
                ws.Cells(1, 1).PasteSpecial
            End If
        End With
        .Quit
    End With
End Sub

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM