Use VBA to Copy/Paste from HTML table, Paste to Excel

I am using VBA in Excel. I am looking to copy only certain pieces of data in an HTML table using VBA. The table I am working with looks like:

<table class="RatingsTable standard" id="RatingsTable1">
                <tr>
                    <th class="top_header" colspan="16">General & Fielding Ratings</th>
                </tr>
                <tr>
                    <th class="event">Event</th><th class="season">Season</th><th class="height">Height</th><th class="weight">Weight</th><th class="rating overall" title="Overall"><span class="hidden">OV</span></th><th class="rating range" title="Range"><span class="hidden">RA</span></th><th class="rating glove" title="Glove"><span class="hidden">GL</span></th><th class="rating armstrength" title="Arm Strength"><span class="hidden">AS</span></th><th class="rating armaccuracy" title="Arm Accuracy"><span class="hidden">AA</span></th><th class="rating pitchcalling" title="Pitch Calling"><span class="hidden">PC</span></th><th class="rating durability" title="Durability"><span class="hidden">DU</span></th><th class="rating health" title="Health"><span class="hidden">HE</span></th><th class="rating speed" title="Speed"><span class="hidden">SP</span></th><th class="rating patience" title="Patience"><span class="hidden">PA</span></th><th class="rating temper" title="Temper"><span class="hidden">TP</span></th><th class="rating makeup" title="Makeup"><span class="hidden">MK</span></th>
                </tr>

                <tr class="odd">
                    <td class="event">Current</td><td class="season">36</td><td class="height">6-0</td><td class="weight">224</td><td>87</td><td>29</td><td>10</td><td>85</td><td>46</td><td>22</td><td>25</td><td>93</td><td>16</td><td>55</td><td>36</td><td>80</td>
                </tr>

                <tr class="even">
                    <td class="event">Projected</td><td class="season">-</td><td class="height">?</td><td class="weight">?</td><td>?</td><td>?</td><td>?</td><td>?</td><td>?</td><td>?</td><td>?</td><td>?</td><td>?</td><td>?</td><td>?</td><td>?</td>
                </tr>

                <tr class="odd">
                    <td class="event">Spring Training</td><td class="season">36</td><td class="height">6-0</td><td class="weight">224</td><td>87</td><td>29</td><td>10</td><td>85</td><td>46</td><td>22</td><td>25</td><td>93</td><td>16</td><td>55</td><td>36</td><td>80</td>
                </tr>

            </table>

The data I am looking to copy and paste is this section:

<td class="event">Current</td><td class="season">36</td><td class="height">6-0</td><td class="weight">224</td><td>87</td><td>29</td><td>10</td><td>85</td><td>46</td><td>22</td><td>25</td><td>93</td><td>16</td><td>55</td><td>36</td><td>80</td>

So, I need to copy 36, 6-0, 224, 87, 29, 10, 85, 46, 22, 25, 93, 16, 55, 36, and 80 from this particular player's table but I am unable to grab this specific data. Is anyone able to help?

I can give you a more precise method, since you want to select only part of the table.

The values you are after appear in the last tr of the table with id="RatingsTable1", that is, the table's last row. (In the sample, that row is labelled "Spring Training" and happens to hold the same values as the "Current" row; the "Current" row itself is the third tr.)

A CSS selector can describe that position:

#RatingsTable1 tr:last-child

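The selector above matches a tr element that is the last child of its parent, located inside the element with id RatingsTable1.

There are likewise :first-child and :nth-child() selectors; for example, #RatingsTable1 tr:nth-child(3) would match the third row.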



VBA:

You apply the selector through the querySelector method of the document.

You haven't shown any code, but if you were using an Internet Explorer object named ie, it would be:

ie.document.querySelector("#RatingsTable1 tr:last-child").innerText

If you have an HTML document variable, e.g. htmlDoc, then it would be:

htmlDoc.querySelector("#RatingsTable1 tr:last-child").innerText
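
To get those values onto a worksheet, something along these lines may work. It is only a sketch: it assumes a reference to the Microsoft HTML Object Library, an MSHTML version that understands the :last-child pseudo-class, and a shortened stand-in for the page HTML. The Sub name, the sample string and the target row 1 are placeholders, not something taken from the question.

```vba
' Requires a reference to "Microsoft HTML Object Library" (Tools > References).
Public Sub WriteLastRowToSheet()
    Dim htmlDoc As New MSHTML.HTMLDocument
    Dim rowEl As Object
    Dim i As Long

    ' Hypothetical, shortened stand-in for the real page source
    htmlDoc.body.innerHTML = _
        "<table id=""RatingsTable1"">" & _
        "<tr><td class=""event"">Projected</td><td>-</td><td>?</td><td>?</td></tr>" & _
        "<tr><td class=""event"">Spring Training</td><td>36</td><td>6-0</td><td>224</td></tr>" & _
        "</table>"

    ' Select the last row of the table; Nothing is returned when there is no match
    Set rowEl = htmlDoc.querySelector("#RatingsTable1 tr:last-child")
    If rowEl Is Nothing Then Exit Sub

    ' Skip cell 0 (the event label) and write the remaining cells across row 1
    For i = 1 To rowEl.Cells.Length - 1
        With ActiveSheet.Cells(1, i)
            .NumberFormat = "@"   ' store as text so values like 6-0 are not auto-converted
            .Value = rowEl.Cells(i).innerText
        End With
    Next i
End Sub
```

Working with the row's Cells collection rather than the row's innerText keeps each value separate, so nothing has to be split apart afterwards.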

In Excel (2007/2010 or later), select the "Data" tab, then "From Web". Enter the URL, use the arrow icons to highlight the HTML table of interest, and then specify the target cell in the Excel worksheet.

You can use the Macro Recorder to generate a template VBA Sub, then fine-tune it for your particular purpose. The procedure is well documented in this Microsoft article: https://support.office.com/en-nz/article/Get-external-data-from-a-Web-page-708f2249-9569-4ff9-a8a4-7ee5f1b1cfba (it also describes how to create a Web Query in Excel, which you may use).
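
For orientation, a recorded web query usually boils down to something like the sketch below. The URL is a made-up placeholder, and the table index "1" is an assumption about where the ratings table sits on the page; the recorder will fill in the real values for your site.

```vba
Public Sub ImportRatingsTable()
    ' Hypothetical URL: replace with the real player page
    Const PAGE_URL As String = "https://www.example.com/player.html"

    With ActiveSheet.QueryTables.Add( _
            Connection:="URL;" & PAGE_URL, _
            Destination:=ActiveSheet.Range("A1"))
        .WebSelectionType = xlSpecifiedTables   ' import only the tables listed in WebTables
        .WebTables = "1"                        ' assumption: the target is the first table on the page
        .WebFormatting = xlWebFormattingNone    ' plain values, no HTML formatting
        .WebDisableDateRecognition = True       ' keep values such as 6-0 from becoming dates
        .RefreshStyle = xlOverwriteCells
        .Refresh BackgroundQuery:=False         ' wait for the data before continuing
    End With
End Sub
```

Once the table is on the sheet, the row you need can be copied or trimmed with ordinary range operations.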

Added per your comments: to reduce the table to the part relevant to your business logic, you may be able to create a parameterized custom web query, if the external website provides that option. The most universal solution is to populate the Excel worksheet with the web data that best fits your goal and then, if necessary, perform the final data trim using Excel VBA.

Just as an FYI: there is also the technique of downloading and parsing the entire HTML file, but I would not recommend that approach.

Hope this may help. Best regards,

It's been a while since this question was posted, but since I've been working on similar projects lately, I thought I could contribute my solution. The following method demonstrates the general logic of parsing an HTML table with VBA and can be modified to suit any similar project. For the function to work, you need a reference to the Microsoft HTML Object Library (Tools > References in the VBA editor).

Public Function parseTableHTML(stringHTML As String, tableID As String, rowClass As String)
    Dim sampleHTML As New MSHTML.HTMLDocument
    Dim tableHTML As MSHTML.HTMLTable
    Dim rowHTML As MSHTML.HTMLTableRow
    Dim cellHTML As MSHTML.HTMLTableCell

    'Load the HTML code you want to parse into the document's body
    sampleHTML.body.innerHTML = stringHTML
    'Get the table by its ID (in this case tableID = "RatingsTable1")
    Set tableHTML = sampleHTML.getElementById(tableID)
    'Get the first row of that table whose class name is rowClass (in this case rowClass = "odd")
    Set rowHTML = tableHTML.getElementsByClassName(rowClass)(0)
    'Loop through the cells of the row of interest and print their text
    For Each cellHTML In rowHTML.Cells
        Debug.Print cellHTML.innerText
    Next cellHTML
End Function
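
If the HTML is not already in a string, one possible way to obtain it and hand it to parseTableHTML is sketched below. The URL is a hypothetical placeholder, the Sub name is my own, and the sketch assumes the page can be fetched with a plain GET request (no login or script-rendered content).

```vba
Public Sub FetchAndParse()
    ' Hypothetical URL: replace with the real player page
    Const PAGE_URL As String = "https://www.example.com/player.html"
    Dim http As Object

    ' Late-bound XMLHTTP object, so no extra reference is needed for the request itself
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", PAGE_URL, False   ' False = synchronous request
    http.send

    If http.Status = 200 Then
        ' Pass the page source to the function above; cell texts go to the Immediate window
        parseTableHTML http.responseText, "RatingsTable1", "odd"
    Else
        Debug.Print "Request failed with status " & http.Status
    End If
End Sub
```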

By the same logic, if the table of interest doesn't have an ID but you know it is, say, the first of several tables in the HTML snippet, you can get it from the collection of tables:

Set tableHTML = sampleHTML.getElementsByTagName("table")(0)

The same principle applies to the row of interest, which in this case you could get from the collection of rows (the index is zero-based, so 2 is the third row, "Current"):

Set rowHTML = tableHTML.getElementsByTagName("tr")(2)
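
Putting the two index-based lookups together, a rough sketch that pastes the chosen row into a worksheet could look like this. The Sub name, its parameters and the hard-coded indexes (first table, third row) are illustrative assumptions, and the Microsoft HTML Object Library reference is still required.

```vba
Public Sub PasteRowByIndex(ByVal stringHTML As String, ByVal target As Range)
    Dim doc As New MSHTML.HTMLDocument
    Dim tbl As MSHTML.HTMLTable
    Dim r As MSHTML.HTMLTableRow
    Dim i As Long

    doc.body.innerHTML = stringHTML

    ' First table in the snippet (assumption: that is the table of interest)
    Set tbl = doc.getElementsByTagName("table")(0)
    If tbl Is Nothing Then Exit Sub
    If tbl.Rows.Length < 3 Then Exit Sub

    ' Third row of that table (zero-based index 2)
    Set r = tbl.Rows(2)

    ' Write each cell's text to the right of the target cell
    For i = 0 To r.Cells.Length - 1
        With target.Offset(0, i)
            .NumberFormat = "@"   ' store as text so values like 6-0 are not auto-converted
            .Value = r.Cells(i).innerText
        End With
    Next i
End Sub
```

It could be called as PasteRowByIndex html, ActiveSheet.Range("A1"), where html holds the page source.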
