
Using Excel VBA to scrape HTML

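I have been trying to scrape and parse some financial data from a website so that I can add it to an Excel spreadsheet using VBA. I have found several possible solutions, but I can't seem to adapt them to my requirements. My problem is that I only need one value from a table (the Average Target Price), and I can't figure out what I'm doing wrong. I will also be using similar VBA to check hundreds of companies at a time, so if there is a more efficient way to write what I have, please let me know.

Here is what I have so far: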

Sub ImportAnalystEst()

Dim oHtml       As HTMLDocument
Dim oElement    As IHTMLElement

Set oHtml = New HTMLDocument

With CreateObject("WINHTTP.WinHTTPRequest.5.1")
    .Open "GET", "http://www.marketwatch.com/investing/stock/aapl/analystestimates", False
    .send
    oHtml.body.innerHTML = .responseText
End With

Dim wsTarget As Worksheet
Dim i As Integer
i = 1
Set wsTarget = ActiveWorkbook.Worksheets("Sheet1")

For Each oElement In oHtml.getElementsByClassName("snapshot")
  wsTarget.Range("A" & i) = Split(oElement.Children(0).innerText, "<TD>")
  i = i + 1
Next

End Sub

This is the HTML I am trying to extract from. Can someone give an example of how to pull out the Average Target Price of 146.52?

<div class="analystEstimates">

<div class="block">
    <h2>Snapshot</h2>
</div>
<table class="snapshot">
    <tbody>
        <tr>
            <td class="first">Average Recommendation:</td>
            <td class="recommendation">
                Overweight
            </td>
            <td class="first column2">Average Target Price:</td>
            <td>146.52</td>
        </tr>
        <tr>
            <td class="first">Number of Ratings:</td>
            <td>

I was able to solve my problem with the following:

Sub ImportAnalystEst()
Dim oHtml       As HTMLDocument
Dim oElement    As IHTMLElement

Set oHtml = New HTMLDocument


With CreateObject("WINHTTP.WinHTTPRequest.5.1")
    .Open "GET", "http://www.marketwatch.com/investing/stock/aapl/analystestimates", False
    .send
    oHtml.body.innerHTML = .responseText
End With

Dim wsTarget As Worksheet
Dim i As Integer
i = 1
Set wsTarget = ActiveWorkbook.Worksheets("Sheet1")


For Each oElement In oHtml.getElementsByClassName("snapshot")
  wsTarget.Range("A" & i) = Split(oHtml.getElementsByClassName("snapshot").Item(0).FirstChild.FirstChild.innerHTML, "TD")(7)
  wsTarget.Range("A" & i) = Replace(wsTarget.Range("A" & i), ">", "")
  wsTarget.Range("A" & i) = Replace(wsTarget.Range("A" & i), "</", "")
  i = i + 1
Next


End Sub

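It is much easier to use a CSS selector combination that targets the value by its position in the table: the cell in the first row that immediately follows the "Average Target Price:" label cell. The selector is .snapshot .first.column2 + td, which uses the "." class selector, the " " descendant combinator, and the "+" adjacent sibling combinator. It matches the td that directly follows the cell carrying both the first and column2 classes inside the element with class snapshot.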

Option Explicit
Public Sub ImportAnalystEst()
    Dim oHtml       As HTMLDocument
    Dim oElement    As IHTMLElement

    Set oHtml = New HTMLDocument

    With CreateObject("WINHTTP.WinHTTPRequest.5.1")
        .Open "GET", "http://www.marketwatch.com/investing/stock/aapl/analystestimates", False
        .send
        oHtml.body.innerHTML = .responseText
    End With
    Debug.Print oHtml.querySelector(".snapshot .first.column2 + td").innertext
End Sub
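
Since the question mentions checking hundreds of companies, the same selector can be wrapped in a function and called in a loop. The following is only a minimal sketch under some assumptions that are not in the original posts: the tickers are listed in column A of Sheet1 starting at row 1, the results go into column B, every ticker uses the same URL pattern as the AAPL page above, and the page markup still matches the HTML shown in the question. The names GetAverageTargetPrice and ImportAllAnalystEst are hypothetical.

```vba
Option Explicit

' Requires a reference to "Microsoft HTML Object Library" (Tools > References),
' as does the code above that declares HTMLDocument.

' Hypothetical helper: returns the text of the cell that follows the
' "Average Target Price:" label, or an empty string if the request fails
' or the selector matches nothing.
Public Function GetAverageTargetPrice(ByVal ticker As String) As String
    Dim oHtml As HTMLDocument
    Dim oCell As IHTMLElement

    Set oHtml = New HTMLDocument

    With CreateObject("WINHTTP.WinHTTPRequest.5.1")
        ' Assumption: every ticker follows the URL pattern used for AAPL.
        .Open "GET", "http://www.marketwatch.com/investing/stock/" & _
                     LCase$(ticker) & "/analystestimates", False
        .send
        If .Status <> 200 Then Exit Function
        oHtml.body.innerHTML = .responseText
    End With

    Set oCell = oHtml.querySelector(".snapshot .first.column2 + td")
    If Not oCell Is Nothing Then GetAverageTargetPrice = Trim$(oCell.innerText)
End Function

' Assumption: tickers are in Sheet1 column A; prices are written to column B.
Public Sub ImportAllAnalystEst()
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim r As Long

    Set ws = ActiveWorkbook.Worksheets("Sheet1")
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For r = 1 To lastRow
        If Len(ws.Cells(r, "A").Value) > 0 Then
            ws.Cells(r, "B").Value = GetAverageTargetPrice(CStr(ws.Cells(r, "A").Value))
        End If
    Next r
End Sub
```

The row counters are declared As Long rather than As Integer so the loop does not overflow on long ticker lists. A cell left empty in column B means the request did not return status 200 or the selector found no match for that ticker.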

This will do what you want.

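Sub Test()
    Const PRICE_LABEL As String = "Average Target Price:"
    Dim IE As Object
    Dim x As String, y As Long, Z As String

    Set IE = CreateObject("InternetExplorer.Application")
    With IE
        .Visible = True
        .Navigate "http://www.marketwatch.com/investing/stock/aapl/analystestimates" ' should work for any URL
        Do Until .ReadyState = 4: DoEvents: Loop

        x = .document.body.innerText
        y = InStr(1, x, PRICE_LABEL)

        If y > 0 Then
            ' Start reading after the label; Mid(x, y, 6) would return "Averag".
            Z = Mid(x, y + Len(PRICE_LABEL), 20)
            ' Turn line breaks and tabs into spaces so Val can skip them.
            Z = Replace(Replace(Replace(Z, vbCr, " "), vbLf, " "), vbTab, " ")
            ' Val reads the leading number and stops at the first non-numeric character.
            Range("A1").Value = Val(Z)
        End If

        .Quit
    End With
End Sub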

