I am trying to pull some data from a HTML table, however I keep getting an ERROR Message 13 "Type Mismatch" . It may be because i am using the wrong html tag. I have been struggling on this for a few days now, as I can not work out the correct process.
This is the HTML that I am using
The Code
Dim htmlTable As Object
Dim collTD As Collection
Dim oNode As Object
Dim IE As Object
Set IE = CreateObject("InternetExplorer.application")
With IE
.Visible = True
.navigate "https://ukonlinestores.co.uk/amazon-uk-sellers/"
' Wait for the page to fully load; you can't do anything if the page is not fully loaded
Do While .Busy Or _
.readyState <> 4
DoEvents
Loop
Set htmlTable = IE.document.getElementById("table_1")
Set collTD = htmlTable.getElementsByTagName("td")
For Each oNode In collTD
.Range("A" & RowCount) = oNode.innerHTML
RowCount = RowCount + 1
Next oNode
End With
UPDATE AS OF TODAY MON 12TH OCTOBER 2020
I have made a few changes to the code, as I wish to navigate the paginate table and extract the data off per page. I know the Pagination code that I added works as I am currently using it on another code, my problem is that it is not navigating the pages, possible due to wrong class or ID.
HTML for NEXT page
New Code, Updated
Dim htmlTable As Object Dim collTD As Object Dim oNode As Object Dim IE As Object Dim RowCount As Long Dim currentColumn As Long ''''' NEW ''''''' Dim HTMLdoc As Object Dim nextPageElements As Object Dim pageNumber As Long '''' START ROW 5 '''''' RowCount = 5 currentColumn = 1 Set IE = CreateObject("Inte.netExplorer.application") With IE.Visible = True.navigate "https://ukonlinestores.co.uk/amazon-uk-sellers/" ' Wait for the page to fully load; you can't do anything if the page is not fully loaded Do While.readyState <> 4: DoEvents: Loop Application.Wait (Now + TimeSerial(0, 0, 2)) StartForLoop_Restart: Set htmlTable =.document.getElementById("table_1") Set collTD = htmlTable.getElementsByTagName("td") For Each oNode In collTD If currentColumn Mod 11 = 0 Then RowCount = RowCount + 1 currentColumn = 1 End If Cells(RowCount, currentColumn) = oNode.innerHTML 'Change to inner HTML to pull of url currentColumn = currentColumn + 1 Next oNode ''''' ######### NEXT PAGE AND DELAYS ###### added by me Do DoEvents ''''' Searches Number of Pages entered in Sheet3 A2 If pageNumber >= Replace(Worksheets("Sheet3").Range("A2").Value, " ", "+") Then Exit Do On Error Resume Next ''' ################### CHANGE THE CLASS OF THE NEXT PAGE ########################## ERROR IS HERE #### Set nextPageElements = HTMLdoc.getElementsByClassName("paginate_button next")(0) 'Set nextPageElements = HTMLdoc.getElementsID("table_1_next ")(0) If nextPageElements Is Nothing Then Exit Do ''' ################### CHANGE THE CLASS OF THE NEXT PAGE ########################## ERROR IS ABOVE #### ''''' Scrolls Down the Browser IE.document.parentWindow.Scroll 0&, 99999 '''''Random delay from Max number entered in Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet3").Range("B2").Value)) nextPageElements.Click 'next web page Do While IE.Busy Or IE.readyState <> 4 DoEvents Loop ''''Second Random delay from Max number entered Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet3").Range("C2").Value)) Set HTMLdoc = IE.document pageNumber = pageNumber + 1 GoTo StartForLoop_Restart Loop End With IE.Quit Set IE = Nothing Set HTMLdoc = Nothing Set nextPageElements = Nothing MsgBox "All Done"
Also need
Currently I don't thing the next page results are copied and pasted under the previous page results. It should be something like this. Color is only for your reff.
You should properly declare your objects as IHTML elements in order to manipulate them and use their methods.
First add a reference to Microsoft HTML Object Library
Your code becomes
Sub test()
Dim htmlTable As IHTMLElement
Dim collTD As IHTMLElementCollection
Dim oNode As IHTMLElement
Dim IE As Object
Dim RowCount As Integer
Set IE = CreateObject("InternetExplorer.application")
With IE
.Visible = True
.navigate "https://ukonlinestores.co.uk/amazon-uk-sellers/"
' Wait for the page to fully load; you can't do anything if the page is not fully loaded
Do While .Busy Or _
.readyState <> 4
DoEvents
Loop
Set htmlTable = IE.document.getElementById("table_1")
Set collTD = htmlTable.getElementsByTagName("td")
RowCount = 1
For Each oNode In collTD
ThisWorkbook.Worksheets(1).Range("A" & RowCount) = oNode.innerHTML
RowCount = RowCount + 1
Next oNode
End With
End Sub
With late binding you don't need a reference.
Sub Test()
Dim htmlTable As Object
Dim collTD As Object
Dim oNode As Object
Dim IE As Object
Dim RowCount As Long
Dim currentColumn As Long
RowCount = 1
currentColumn = 1
Set IE = CreateObject("InternetExplorer.application")
With IE
.Visible = True
.navigate "https://ukonlinestores.co.uk/amazon-uk-sellers/"
' Wait for the page to fully load; you can't do anything if the page is not fully loaded
Do While .readyState <> 4: DoEvents: Loop
Application.Wait (Now + TimeSerial(0, 0, 2))
Set htmlTable = .document.getElementById("table_1")
Set collTD = htmlTable.getElementsByTagName("td")
For Each oNode In collTD
If currentColumn Mod 11 = 0 Then
RowCount = RowCount + 1
currentColumn = 1
End If
Cells(RowCount, currentColumn) = oNode.innertext
currentColumn = currentColumn + 1
Next oNode
End With
End Sub
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.