
Excel VBA macro to get HTML SPAN ID value

I appreciate there are similar questions, but as a novice I find it hard to fully adapt the examples.

Problem Statement: I want to create a macro in Excel to pull the "last updated" value found on the website https://www.centralbank.ae/en/fx-rates . Specifically, this is found within their HTML code (example value also below):

<span class="dir-ltr">11 Feb 2021 6:00PM</span>

What I wanted to Repurpose: The code here ( https://www.encodedna.com/excel/extract-contents-from-html-element-of-a-webpage-in-excel-using-vba.htm ) seemed to be a very clean way of launching IE in the background and then cleaning up all objects afterwards. It iterates through hyperlinks, which I don't need to do.

My code doesn't seem to work:

Option Explicit

Const sSiteName = "https://www.centralbank.ae/en/fx-rates"

Private Sub GetHTMLContents()
    ' Create Internet Explorer object.
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = False          ' Keep this hidden.
    
    IE.navigate sSiteName
    
    ' Wait till IE is fully loaded.
    While IE.readyState <> 4
        DoEvents
    Wend
    
    Dim oHDoc As HTMLDocument     ' Create document object.
    Set oHDoc = IE.document
    
    Dim oHEle As HTMLSpanElement     ' Create HTML element (<span>) object.
    Set oHEle = oHDoc.getElementById("dir-ltr").innerText ' Get the element ref using its ID. [A]
    

    
    ' Clean up.
    IE.Quit
    Set IE = Nothing
    Set oHEle = Nothing
    Set oHDoc = Nothing
End Sub

Once it prints the innerText, I thought I could replace the line commented [A] with something like this, but again I am not 100% sure how to adapt it:

Cells(iCnt + 1, 1) = .getElementsByTagName("h1").Item(iCnt).getElementsByTagName("a").Item(0).innerHTML

The goal is to write the text of this span (identified by its class) into a cell in an Excel worksheet (say "Sheet1").

The span tag has no ID; dir-ltr is its class. You can get all elements with a specific class with getElementsByClassName() . The get methods with the plural Elements return a node collection whose index starts at 0. In this document, the span is the only element with the class dir-ltr.

You can refer to it via index 0, written either after the name of a variable that holds the node collection (like an array) or directly after the method call. If you index directly after the method call, the node collection is discarded immediately, but you still get the indexed element.
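The two ways of indexing could be sketched as follows. This is only an illustration: it assumes IE is an InternetExplorer.Application object whose page has already finished loading, and the names ShowIndexingStyles, nodes and firstSpan are hypothetical.

```vba
' Minimal sketch; IE is assumed to be an already-loaded
' InternetExplorer.Application object passed in by the caller.
Private Sub ShowIndexingStyles(ByVal IE As Object)
    Dim nodes As Object       ' hypothetical name: holds the node collection
    Dim firstSpan As Object   ' hypothetical name: holds one element

    ' Variant 1: keep the node collection in a variable, then index it.
    Set nodes = IE.document.getElementsByClassName("dir-ltr")
    Set firstSpan = nodes(0)
    Debug.Print nodes.Length, firstSpan.innerText

    ' Variant 2: index directly after the method call.
    ' The collection itself is not kept, only the element at index 0.
    Set firstSpan = IE.document.getElementsByClassName("dir-ltr")(0)
    Debug.Print firstSpan.innerText
End Sub
```

Variant 1 is handy if you also want to check how many elements were found; variant 2 is shorter when you only need one element.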

If you want to read the innerText, you can do so directly after the index, but then you have a string, not an object. I used that in the following code:

Private Sub GetHTMLContents()
  
  Const sSiteName = "https://www.centralbank.ae/en/fx-rates"
  Dim IE As Object
  
  'Create Internet Explorer object.
  Set IE = CreateObject("InternetExplorer.Application")
  IE.Visible = False ' Keep this hidden.
  IE.navigate sSiteName
  
  ' Wait till IE is fully loaded.
  While IE.readyState <> 4: DoEvents: Wend
  
  ' Add a new sheet named "New sheet" at the end of the workbook.
  ThisWorkbook.Sheets.Add After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count)
  ThisWorkbook.ActiveSheet.Name = "New sheet"
  
  ' Write the text of the first element with class "dir-ltr" into cell A1.
  ThisWorkbook.Sheets("New sheet").Cells(1, 1) = IE.document.getElementsByClassName("dir-ltr")(0).innerText
  
  ' Clean up.
  IE.Quit
  Set IE = Nothing
End Sub
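If the page layout changes and no element with that class exists, indexing with (0) fails at run time. A slightly more defensive variant might look like the sketch below. It assumes that a worksheet named "Sheet1" already exists (as mentioned in the question) and that cell A1 is the target; the procedure name GetLastUpdatedGuarded is hypothetical.

```vba
Private Sub GetLastUpdatedGuarded()
    Const sSiteName As String = "https://www.centralbank.ae/en/fx-rates"
    Dim IE As Object
    Dim nodes As Object

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = False          ' Keep this hidden.
    IE.navigate sSiteName

    ' Wait until IE is no longer busy and reports readyState 4 (complete).
    Do While IE.Busy Or IE.readyState <> 4
        DoEvents
    Loop

    ' Keep the collection so its length can be checked before indexing.
    Set nodes = IE.document.getElementsByClassName("dir-ltr")
    If nodes.Length > 0 Then
        ' Assumption: "Sheet1" exists and A1 is where the value should go.
        ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).Value = nodes(0).innerText
    Else
        Debug.Print "No element with class dir-ltr was found."
    End If

    ' Clean up.
    IE.Quit
    Set IE = Nothing
End Sub
```

This writes into an existing sheet instead of adding a new one, so it can be run repeatedly without hitting a duplicate sheet name.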
