
Using the XMLHTTP object to parse some websites in VBA

I am trying to grab the "Key people" field from the Wikipedia page https://en.wikipedia.org/wiki/Abbott_Laboratories and copy its value into my Excel spreadsheet.

I managed to do it with XMLHTTP, an approach I like for its speed. The working code is shown below.

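However, the code is not flexible enough, because the structure of a wiki page can vary. For example, it does not work on this page: https://en.wikipedia.org/wiki/3M

The reason is that the tr/td structure is not exactly the same ("Key people" is no longer the 8th TR on the 3M page).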

How can I improve my code?

Public Sub parsehtml()
    Dim http As Object, html As New HTMLDocument, topics As Object, titleElem As Object, detailsElem As Object, topic As HTMLHtmlElement
    Dim i As Integer

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", "https://en.wikipedia.org/wiki/Abbott_Laboratories", False
    http.send

    html.body.innerHTML = http.responseText

    ' Fixed row index: this is the part that breaks on pages with a different layout
    Set topic = html.getElementsByTagName("tr")(8)
    Set titleElem = topic.getElementsByTagName("td")(0)

    ThisWorkbook.Sheets(1).Cells(1, 1).Value = titleElem.innerText
End Sub

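If the table row for "Key people" is not at a fixed position, why not loop through the table looking for "Key people"?

I tested the following modification and found that it works.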

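In the declarations section (replacing the earlier declaration of topic, so that its Rows collection is available):

Dim topic As HTMLTable, Rw As HTMLTableRow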

Then, at the end:

html.body.innerHTML = http.responseText
Set topic = html.getElementsByClassName("infobox vcard")(0)

' Scan the infobox rows for the one labelled "Key people"
For Each Rw In topic.Rows
    If Rw.Cells(0).innerText = "Key people" Then
        ThisWorkbook.Sheets(1).Cells(1, 1).Value = Rw.Cells(1).innerText
        Exit For
    End If
Next
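
The row lookup above could also be wrapped in a reusable function, so that the same code serves both pages. This is only a minimal sketch, not the tested code from this answer. The names GetInfoboxField and DemoInfoboxField and their parameters are hypothetical. It assumes a reference to the Microsoft HTML Object Library, and that the infobox carries the class "infobox vcard" as on the two pages discussed.

```vba
Option Explicit

' Hypothetical helper: returns the text of the infobox row whose label
' matches fieldLabel, or an empty string if nothing is found.
' Assumes a reference to "Microsoft HTML Object Library".
Public Function GetInfoboxField(ByVal url As String, ByVal fieldLabel As String) As String
    Dim http As Object, html As New HTMLDocument
    Dim tables As Object, tbl As HTMLTable, rw As HTMLTableRow

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False
    http.send
    If http.Status <> 200 Then Exit Function    ' request failed

    html.body.innerHTML = http.responseText
    Set tables = html.getElementsByClassName("infobox vcard")
    If tables.Length = 0 Then Exit Function     ' no infobox on this page
    Set tbl = tables(0)

    For Each rw In tbl.Rows
        ' Only rows with both a label cell and a value cell can match
        If rw.Cells.Length >= 2 Then
            If StrComp(Trim$(rw.Cells(0).innerText), fieldLabel, vbTextCompare) = 0 Then
                GetInfoboxField = rw.Cells(1).innerText
                Exit Function
            End If
        End If
    Next rw
End Function

' Example call: writes the 3M value into cell A1 of the first sheet
Public Sub DemoInfoboxField()
    ThisWorkbook.Sheets(1).Cells(1, 1).Value = _
        GetInfoboxField("https://en.wikipedia.org/wiki/3M", "Key people")
End Sub
```

Returning an empty string on failure keeps the caller simple. Whether a missing field should instead raise an error depends on how the spreadsheet is used.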

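There is a better, faster way, at least for the given URLs. Match on the element's class name and index into the returned nodeList. Fewer items are returned, the path to the element is shorter, and matching on a class name is faster than matching on an element type.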

Option Explicit
Public Sub GetKeyPeople()
    ' Requires a reference to "Microsoft HTML Object Library"
    Dim html As HTMLDocument, urls(), i As Long, keyPeople
    Set html = New HTMLDocument
    urls = Array("https://en.wikipedia.org/wiki/Abbott_Laboratories", "https://en.wikipedia.org/wiki/3M")
    ' One XMLHTTP object is reused for every request
    With CreateObject("MSXML2.XMLHTTP")
        For i = LBound(urls) To UBound(urls)
            .Open "GET", urls(i), False
            .send
            html.body.innerHTML = .responseText
            ' For the given URLs, the item at index 1 of the ".agent" matches is used
            keyPeople = html.querySelectorAll(".agent").item(1).innerText
            ' One result per row, starting at row 1
            ThisWorkbook.Worksheets("Sheet1").Cells(i + 1, 1).Value = keyPeople
        Next
    End With
End Sub
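
Indexing the nodeList directly fails at run time on a page where fewer than two elements match ".agent". A guarded lookup might look like the sketch below. The name NthMatchText and its parameters are hypothetical, and it assumes the same Microsoft HTML Object Library reference as the code above.

```vba
' Hypothetical helper: returns the innerText of the match at position
' index (zero-based) for a CSS selector, or an empty string when the
' page has too few matches.
Public Function NthMatchText(ByVal html As HTMLDocument, ByVal selector As String, ByVal index As Long) As String
    Dim nodes As Object
    Set nodes = html.querySelectorAll(selector)
    ' Check the bounds before touching the item
    If index >= 0 And index < nodes.Length Then
        NthMatchText = nodes.item(index).innerText
    End If
End Function
```

Inside the loop, the assignment would then read keyPeople = NthMatchText(html, ".agent", 1), and an empty cell signals a page whose layout differs.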
