简体   繁体   中英

VBA scrape Table every cell

I am trying to scrape a table from a website that required logging on, inputting search option, before the table is even displayed. I managed to do the prior, however once the table is displayed I do not know how to get it onto my worksheet.

I have the table HTML location here:

IE.document.querySelector("[id='advanced_iframe']").contentDocument.querySelector("table[id=GridView1]")

and I can get the text from the table via:

IE.document.querySelector("[id='advanced_iframe']").contentDocument.querySelector("table[id=GridView1]").innerText

But I do not know how to get each cell of the table into every cell in my excel sheet ("Sheet1")

Table from the webpage:

在此处输入图片说明

Any help is appreaciated.

I can only assume you're using Windows and trying to run the VBA from inside Excel - you don't say otherwise, so here's the simplest solution that doesn't involve looping or dependencies of table format codes

You basically copy/paste the table into Excel using Excel's built in HTML translation tool and Microsoft's Clipboard

First - copy/paste Microsoft's Clipboard API functions into module

Option Explicit
Private Declare Function OpenClipboard Lib "user32.dll" (ByVal hWnd As Long) As Long
Private Declare Function EmptyClipboard Lib "user32.dll" () As Long
Private Declare Function CloseClipboard Lib "user32.dll" () As Long
Private Declare Function IsClipboardFormatAvailable Lib "user32.dll" (ByVal wFormat As Long) As Long
Private Declare Function GetClipboardData Lib "user32.dll" (ByVal wFormat As Long) As Long
Private Declare Function SetClipboardData Lib "user32.dll" (ByVal wFormat As Long, ByVal hMem As Long) As Long
Private Declare Function GlobalAlloc Lib "kernel32.dll" (ByVal wFlags As Long, ByVal dwBytes As Long) As Long
Private Declare Function GlobalLock Lib "kernel32.dll" (ByVal hMem As Long) As Long
Private Declare Function GlobalUnlock Lib "kernel32.dll" (ByVal hMem As Long) As Long
Private Declare Function GlobalSize Lib "kernel32" (ByVal hMem As Long) As Long
Private Declare Function lstrcpy Lib "kernel32.dll" Alias "lstrcpyW" (ByVal lpString1 As Long, ByVal lpString2 As Long) As Long

Public Sub SetClipboard(sUniText As String)
    Dim iStrPtr As Long
    Dim iLen As Long
    Dim iLock As Long
    Const GMEM_MOVEABLE As Long = &H2
    Const GMEM_ZEROINIT As Long = &H40
    Const CF_UNICODETEXT As Long = &HD
    OpenClipboard 0&
    EmptyClipboard
    iLen = LenB(sUniText) + 2&
    iStrPtr = GlobalAlloc(GMEM_MOVEABLE Or GMEM_ZEROINIT, iLen)
    iLock = GlobalLock(iStrPtr)
    lstrcpy iLock, StrPtr(sUniText)
    GlobalUnlock iStrPtr
    SetClipboardData CF_UNICODETEXT, iStrPtr
    CloseClipboard
End Sub

Public Function GetClipboard() As String
    Dim iStrPtr As Long
    Dim iLen As Long
    Dim iLock As Long
    Dim sUniText As String
    Const CF_UNICODETEXT As Long = 13&
    OpenClipboard 0&
    If IsClipboardFormatAvailable(CF_UNICODETEXT) Then
        iStrPtr = GetClipboardData(CF_UNICODETEXT)
        If iStrPtr Then
            iLock = GlobalLock(iStrPtr)
            iLen = GlobalSize(iStrPtr)
            sUniText = String$(iLen \ 2& - 1&, vbNullChar)
            lstrcpy StrPtr(sUniText), iLock
            GlobalUnlock iStrPtr
        End If
        GetClipboard = sUniText
    End If
    CloseClipboard
End Function

Then change your line

IE.document.querySelector("[id='advanced_iframe']").contentDocument.querySelector("table[id=GridView1]").innerText

to assign it to a string variable using outerHTML to get TABLE markup

table_html = IE.document.querySelector("[id='advanced_iframe']").contentDocument.querySelector("table[id=GridView1]").outerHTML

And then copy the table_html to the clipboard, before pasting into your starting cell for your table

SetClipboard table_html

Worksheets("Sheet1").Activate
Range("A1").Select
ActiveSheet.PasteSpecial Format:="Unicode Text", Link:=False, DisplayAsIcon:=False

Here's a tested working example:

Public Sub TestHTMLPaste()

On Error GoTo Err_TestHTMLPaste
    Const SiteURL As String = "https://www.grapecity.com/controls/activereports/download-version-history"
    
    Dim IE          As Object
    Dim BodyHTML    As String
    Dim FieldStart  As Integer
    Dim FieldEnd    As Integer
    
    Dim TableHTML   As String
    
    Set IE = CreateObject("InternetExplorer.Application")
    
    With IE
        .Navigate SiteURL
        Do While .Busy Or .ReadyState <> 4
            DoEvents
        Loop
    
        BodyHTML = .document.body.innerhtml
        
        Debug.Print BodyHTML
        
        If InStr(BodyHTML, "<table class=""gctable"">") > 0 Then
            Debug.Print "Found it"
            
            TableHTML = .document.querySelector("table[class=gctable]").outerHTML
            
            SetClipboard TableHTML
            DoEvents
            
        End If
        
        .Quit
        
    End With
    
    DoEvents
    
    If TableHTML <> "" Then
        Worksheets("Sheet1").Activate
        Range("A1").Select
        ActiveSheet.PasteSpecial Format:="Unicode Text", Link:=False, DisplayAsIcon:=False
        DoEvents
        Range("A1").Select
    Else
        MsgBox " No Table HTML found"
    End If
    
    
Err_TestHTMLPaste:
    Set IE = Nothing

End Sub

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM