
Import web data with Excel VBA

When I enter a product's URL, I want the product's name, description, price, and image URL to be written to a spreadsheet.

Here's what I have so far (this is not the real website):

Sub Trial()
    Dim ieObj As InternetExplorer
    Dim ht As HTMLDocument
    Dim Website As String
    
    Website = "https://www.amazon.com/resistencia-Avalon-cartas-empaque-original/dp/B009SAAV0C?pf_rd_r=WWESR922Z214Y10K3PHH&pf_rd_p=4dd821c0-e689-433a-a035-5e03461484eb&pd_rd_r=305599f9-5f3f-41c6-9a13-8daefd8d998c&pd_rd_w=qWHso&pd_rd_wg=BNzqC&ref_=pd_gw_unk"
    
    Set ieObj = New InternetExplorer
    ieObj.Visible = True
    ieObj.navigate Website
    
    Do Until ieObj.readyState = READYSTATE_COMPLETE
        DoEvents
    Loop
    
    Set ht = ieObj.document
    
End Sub

Additional information
Name of product: The Resistance: Avalon Social Deduction Game
id="productTitle" class="a-size-large product-title-word-break"

Description of product: The Resistance: Avalon is a standalone game and while The Resistance is not required to play; the games are compatible and can be combined
For 5 to 10 players
Takes 30 minute playtime
(All in class = "a-list-item" but different sections)

Price: $17.12
id="priceblock_ourprice"
class="a-size-medium a-color-price priceBlockBuyingPriceString"

Image URL: https://images-na.ssl-images-amazon.com/images/I/91JhcC33dTL._AC_SY879_.jpg
img alt="The Resistance: Avalon Social Deduction Game"

You can use an XMLHTTP request (XHR) instead of Internet Explorer to fetch these fields. It is usually much faster, because no browser window has to load and render the page. I used a regex only to isolate the desired image link. Before running the code, add a reference to the Microsoft HTML Object Library (Tools > References in the VBA editor).

Sub GetContent()
    Const URL = "https://www.amazon.com/resistencia-Avalon-cartas-empaque-original/dp/B009SAAV0C?pf_rd_r=WWESR922Z214Y10K3PHH&pf_rd_p=4dd821c0-e689-433a-a035-5e03461484eb&pd_rd_r=305599f9-5f3f-41c6-9a13-8daefd8d998c&pd_rd_w=qWHso&pd_rd_wg=BNzqC&ref_=pd_gw_unk"
    Dim S$, sImage$, Matches As Object

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", URL, False
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:79.0) Gecko/20100101 Firefox/79.0"
        .send
        S = .responseText
    End With
    
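    ' Parse the response text with MSHTML so CSS selectors can be used
    With New HTMLDocument
        .body.innerHTML = S
        ' Product name
        [A1] = .querySelector("h1#title > span#productTitle").innerText
        ' Description: keep only the bullet text that follows the "model number." notice
        [B1] = Trim(Split(.querySelector("#feature-bullets > ul.a-unordered-list").innerText, "model number.")(1))
        ' Price
        [C1] = .querySelector("span[id='priceblock_ourprice']").innerText
        ' Raw attribute value that contains the image URLs as quoted strings
        sImage = .querySelector("#imgTagWrapperId > img").getAttribute("data-a-dynamic-image")
    End With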
    
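    ' Capture every double-quoted string in the attribute value
    With CreateObject("VBScript.RegExp")
        .Global = True
        .IgnoreCase = False
        .Pattern = """(.*?)"""
        .MultiLine = True
        Set Matches = .Execute(sImage)
        ' Matches is zero-based, so this writes the third quoted URL
        [D1] = Matches(2).submatches(0)
    End With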
End Sub
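
The macro above indexes Matches(2) directly, which raises a run-time error if the attribute contains fewer than three quoted strings. A minimal sketch of a guarded version of the same regex step follows. QuotedItem and DemoQuotedItem are hypothetical names that are not part of the answer, and the sample string is made up to resemble the quoted-URL shape that the answer's pattern assumes.

```vba
Option Explicit

' Hypothetical helper (not from the original answer): returns the
' Index-th (zero-based) double-quoted string in Text, or "" if there
' is no such match.
Function QuotedItem(ByVal Text As String, ByVal Index As Long) As String
    Dim Matches As Object

    With CreateObject("VBScript.RegExp")
        .Global = True              ' find every quoted string, not just the first
        .Pattern = """(.*?)"""      ' same pattern as in the answer
        Set Matches = .Execute(Text)
    End With

    ' Guard against an out-of-range index instead of raising an error
    If Index >= 0 And Index < Matches.Count Then
        QuotedItem = Matches(Index).SubMatches(0)
    End If
End Function

Sub DemoQuotedItem()
    ' Made-up sample value; the real attribute content is an assumption
    Const Sample As String = _
        "{""https://example.com/a.jpg"":[100,100],""https://example.com/b.jpg"":[200,200]}"

    Debug.Print QuotedItem(Sample, 0)   ' first quoted URL
    Debug.Print QuotedItem(Sample, 1)   ' second quoted URL
    Debug.Print QuotedItem(Sample, 5)   ' out of range, prints an empty line
End Sub
```

With this helper in the same module, the last line of the regex step could be written as [D1] = QuotedItem(sImage, 2), which leaves the cell empty rather than stopping the macro when the page returns fewer images.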
