简体   繁体   中英

VBA copy html table data to excel worksheet

I need a VBA script that can extract local html table data to an Excel worksheet. I have some code (found it somewhere on the web) that works by using a URL link, but what I want is to be able to do it using my locally stored html file. The error is I get is 'app defined or object defined error' .

Sub HTML_Table_To_Excel() 

    Dim htm As Object 
    Dim Tr As Object 
    Dim Td As Object 
    Dim Tab1 As Object 
    
    'Replace the URL of the webpage that you want to download 
    Web_URL = "http://espn.go.com/nba/" 
    
    'Create HTMLFile Object 
    Set HTML_Content = CreateObject("htmlfile") 

    'Get the WebPage Content to HTMLFile Object 
    With CreateObject("msxml2.xmlhttp") 
        .Open "GET", Web_URL, False 
        .send 
        HTML_Content.body.innerHTML = .responseText 'this is the highlighted part for the error 
    End With 
    
    Column_Num_To_Start = 1 
    iRow = 2 
    iCol = Column_Num_To_Start 
    iTable = 0 

    'Loop Through Each Table and Download it to Excel in Proper Format 
    For Each Tab1 In HTML_Content.getElementsByTagName("table") 
        With HTML_Content.getElementsByTagName("table")(iTable) 
            For Each Tr In .Rows 
                For Each Td In Tr.Cells 
                    Sheets(1).Cells(iRow, iCol).Select 
                    Sheets(1).Cells(iRow, iCol) = Td.innerText 
                    iCol = iCol + 1 
                Next Td
                iCol = Column_Num_To_Start 
                iRow = iRow + 1 
            Next Tr 
        End With 

        iTable = iTable + 1 
        iCol = Column_Num_To_Start 
        iRow = iRow + 1 
    Next Tab1 

    MsgBox "Process Completed" 
End Sub

Not sure if i've followed the conventions, but i somehow managed to get an HTML table exported to excel successfully. Here's my vb script. Any optimizations/corrections are welcome! Thanks.

Sub Export()
rowsLength =document.all.yourHTMLTableId.rows.length
cellLength= (document.all.yourHTMLTableId.Cells.length/rowsLength) 'Because i dont know how to get no.of cells in a row,so used a simple division 

 Set crr = CreateObject("WScript.Shell")

 fileNm= "Export"
 dir= crr.CurrentDirectory&"\"&fileNm&".xlsx"
 Set objExcel = CreateObject("Excel.Application")
 Set objWorkbook = objExcel.Workbooks.Add()
 Set objWorksheet= objWorkbook.Worksheets(1)
 i=0
 j=0

 do until i=rowsLength
   do until j=cellLength
   objWorksheet.cells(i+1,j+1).value =  document.all.yourHTMLTableId.rows(i).cells(j).innerHTML
   msgbox document.all.yourHTMLTableId.rows(i).cells(j).innerHTML 
   j= j+1
   Loop
 j=0    
 i=i+1
 Loop
 objWorkbook.SaveAs(dir)
 objWorkbook.close
 objExcel.Quit
 Set objExcel = Nothing

End Sub

I had the same problem and to solve it I used the original code of the question, but instead of downloading the html, I opened the html as a text file and the result was passed to the object HTML_Content.body.innerHtml the rest of the code is same.

Sub HTML_Table_To_Excel() 

Dim htm As Object 
Dim Tr As Object 
Dim Td As Object 
Dim Tab1 As Object
Dim file as String

'Replace the file path with your own 
file = "c:\your_File.html"

'Determine the next file number available for use by the FileOpen function
TextFile = FreeFile

'Open the text file
Open file For Input As TextFile

'Create HTMLFile Object 
Set HTML_Content = CreateObject("htmlfile") 
HTML_Content.body.innerHtml = Input(LOF(TextFile), TextFile)

Column_Num_To_Start = 1 
iRow = 2 
iCol = Column_Num_To_Start 
iTable = 0 

'Loop Through Each Table and Download it to Excel in Proper Format 
For Each Tab1 In HTML_Content.getElementsByTagName("table") 
    With HTML_Content.getElementsByTagName("table")(iTable) 
        For Each Tr In .Rows 
        For Each Td In Tr.Cells 
            Sheets(1).Cells(iRow, iCol).Select 
            Sheets(1).Cells(iRow, iCol) = Td.innerText 
            iCol = iCol + 1 
            Next Td 
            iCol = Column_Num_To_Start 
            iRow = iRow + 1 
        Next Tr 
    End With 

    iTable = iTable + 1 
    iCol = Column_Num_To_Start 
    iRow = iRow + 1 
Next Tab1 

MsgBox "Process Completed" 
End Sub

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM