简体   繁体   中英

VBA screen-scraping via Internet explorer

Could somebody please help point me in the right direction to get my code working again? The "Old way" has worked for many years through VBA, it makes regular queries to the website below to collect the weather via Internet Explorer. However, now the minute IE is called, it opens in Edge and the code fails.

I have been trying to get the same code to work via XML v6 and have nearly succeeded with the "New way". I can get a single page to load but I need to incorporate the logic where an area code is input into a search box before a button is pressed on the website. The Input is box is HTMLDoc.getElementById("keyword") and the button is HTMLDoc.getElementsByTagName("button"). Only once the area is entered, and the button is pressed, does the weather for that region get returned.

Is it possible to make this query the New way please?

I have also learnt that this may be possible with Selenium and a web driver query in VBA. However, I heard that each time Microsoft Edge gets update, you need to download a new driver for the web driver, which seems like over-kill given how basic the query is.

Thanks in advance!

James

Old way

//References: Microsoft Internet Controls, Microsoft HTML Object Library

Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLInput As MSHTML.IHTMLElement
Dim HTMLButtons As MSHTML.IHTMLElementCollection

IE.Visible = True
IE.Navigate ("www.bom.gov.au/aviation/forecasts/taf/")
//issue is that ie does not seem to exist, think it is because this actually redirects to edge now so code crashes from this point

Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop
Debug.Print IE.LocationName; IE.LocationURL

Set HTMLDoc = IE.Document
Set HTMLInput = HTMLDoc.getElementById("keyword")
HTMLInput.Value = "20"
Set HTMLButtons = HTMLDoc.getElementsByTagName("button")
HTMLButtons(0).Click

New way

//References: Microsoft XML, v6.0

Dim XMLPage As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
    
XMLPage.Open "GET", "http://bom.gov.au/aviation/forecasts/taf/", False
XMLPage.send
    
HTMLDoc.body.innerHTML = XMLPage.responseText
Call ProcessHTMLPage(HTMLDoc)

Using XMLHTTP, you have to forget it about interacting with the webpage as you would in a browser as the content is static and modern website nowadays mostly generate its content dynamically.

Inspecting the website from DevTools, it appears the search trigger a POST request to http://www.bom.gov.au/aviation/php/process.php with the keyword as the data which returns the result so this is what you have to replicate.

Below is a sample code using YBWP as the keyword (change the constant or adapt to your code) and insert the output to Sheet1 starting from cell A1 :

Sub Test()
    Const searchKeyWord As String = "YBWP"
    
    Dim XMLPage As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim requestData As String
    
    requestData = "keyword=" & searchKeyWord & "&type=search&page=TAF"
    
    XMLPage.Open "POST", "http://www.bom.gov.au/aviation/php/process.php", False
    XMLPage.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    XMLPage.send requestData 
   
    HTMLDoc.body.innerHTML = XMLPage.responseText
    
    Dim resultColl As Object
    Set resultColl = HTMLDoc.getElementsByTagName("p")
    
    Dim i As Long
    For i = 0 To resultColl.Length - 1
        ThisWorkbook.Worksheets("Sheet1").Cells(i + 1, 1).Value = resultColl(i).innerText
    Next i
End Sub

Running this will return XMLPage.responseText as follows:

<h3>WEIPA YBWP</h3><p class="product">TAF YBWP 251217Z 2514/2602<br />10006KT 9999 SCT020<br />RMK<br />T 27 25 25 30 Q 1011 1009 1011 1012</p><p class="product">METAR YBWP 251230Z AUTO 00000KT 9999 // SCT027 SCT033 BKN047 28/24<br />Q1012 RMK RF00.0/000.0</p>

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM