簡體   English   中英

從html頁面獲取“ href = link”並使用vba導航到該鏈接

[英]Get “href=link” from html page and navigate to that link using vba

我正在Excel VBA中編寫代碼以獲取類的href值並導航至該href鏈接(即,這是我要進入特定Excel工作表的href值,並且我想通過我的VBA自動導航至該鏈接碼。

<a href="/questions/51509457/how-to-make-the-word-invisible-when-its-checked-without-js" class="question-hyperlink">How to make the word invisible when it's checked without js</a>

我得到的結果是,我能夠得到包含標簽的類值的方法。 How to make the word invisible when it's checked without js <----這就是標題,這是我在工作表中得到的。 我想要得到的是此標題持有的href鏈接/questions/51509457/how-to-make-the-word-invisible-when-its-checked-without-js這是我想要得到的內容並在我的代碼中導航。

請幫幫我。 提前致謝

以下是完整的編碼:

Sub useClassnames()
    Dim element As IHTMLElement
    Dim elements As IHTMLElementCollection
    Dim ie As InternetExplorer
    Dim html As HTMLDocument

    'open Internet Explorer in memory, and go to website
    Set ie = New InternetExplorer
    ie.Visible = True
    ie.navigate "https://stackoverflow.com/questions"
    'Wait until IE has loaded the web page

    Do While ie.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    Set html = ie.document
    Set elements = html.getElementsByClassName("question-hyperlink")

    Dim count As Long
    Dim erow As Long
    count = 0

    For Each element In elements
        If element.className = "question-hyperlink" Then
            erow = Sheets("Exec").Cells(Rows.count, 1).End(xlUp).Offset(1, 0).Row
            Sheets("Exec").Cells(erow, 1) = html.getElementsByClassName("question-hyperlink")(count).innerText
            count = count + 1
        End If
    Next element

    Range("H10").Select
End Sub

我在這個網站上找不到任何人問的任何答案。 請不要建議這個問題重復。

<div class="row hoverSensitive">
        <div class="column summary-column summary-column-icon-compact  ">
                                <img src="images/app/run32.png" alt="" width="32" height="32">
                        </div>
        <div class="column summary-column  ">
            <div class="summary-title summary-title-compact text-ppp">
                                        <a href="**index.php?/runs/view/7552**">MMDA</a>

            </div>
            <div class="summary-description-compact text-secondary text-ppp">
                                                                            By on 7/9/2018                                                  </div>
        </div>      
        <div class="column summary-column summary-column-bar  ">
                            <div class="table">
<div class="column">
    <div class="chart-bar ">
                                                                                                                        <div class="chart-bar-custom link-tooltip" tooltip-position="left" style="background: #4dba0f; width: 125px" tooltip-text="100% Passed (11/11 tests)"></div>
                                                                                                                                                                                                                                                                                                                                                                                                            </div>
</div>
    <div class="column chart-bar-percent chart-bar-percent-compact">
    100%'

方法①

使用XHR通過問題首頁URL發出初始請求; 應用CSS選擇器檢索鏈接,然后將那些鏈接傳遞給IE導航到


CSS選擇器以選擇元素:

您需要元素的href屬性。已經給出了一個示例。 您可以使用getAttribute,或者如@Santosh所指出的那樣,將href屬性CSS選擇器與其他CSS選擇器結合起來以定位元素。

CSS選擇器:

a.question-hyperlink[href]

查找具有parent a標簽的元素, a標簽具有question-hyperlink類和href屬性。

然后,將CSS選擇器組合與documentquerySelectorAll方法一起應用以收集鏈接的nodeList。


XHR獲取鏈接的初始列表:

我會以更快的速度將它作為XHR首先發布,並將您的鏈接收集到collection / nodeList中,然后可以在IE瀏覽器中循環。

Option Explicit
Public Sub GetLinks()
    Dim sResponse As String, HTML As New HTMLDocument, linkList As Object, i As Long
    Const BASE_URL As String = "https://stackoverflow.com"
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://stackoverflow.com/questions", False
        .send
        sResponse = StrConv(.responseBody, vbUnicode)
    End With
    sResponse = Mid$(sResponse, InStr(1, sResponse, "<!DOCTYPE "))

    With HTML
        .body.innerHTML = sResponse
        Set linkList = .querySelectorAll("a.question-hyperlink[href]")
        For i = 0 To linkList.Length - 1
            Debug.Print Replace$(linkList.item(i), "about:", BASE_URL)
        Next i
    End With
    'Code using IE and linkList
End Sub

在上面的linkList是一個nodeList,其中包含主頁中所有匹配的元素,即問題登錄頁面上的所有href 您可以循環的.Length的的nodeList和索引它來檢索特定href如linkList.item(I)。 由於返回的鏈接是相對的,因此您需要用協議+域(即"https://stackoverflow.com"替換路徑中的“ about:部分。

既然您已經快速獲得該列表並可以訪問項目,則可以將任何給定的更新href傳遞到IE.Navigate


使用IE和nodeList導航到問題

For i = 0 To linkList.Length - 1
    IE.Navigate Replace$(linkList.item(i).getAttribute("href"), "about:", BASE_URL)
Next i

方法②

使用XHR使用GET請求發出初始請求並搜索問題標題; 應用CSS選擇器檢索鏈接,然后將這些鏈接傳遞給IE進行導航。


Option Explicit
Public Sub GetLinks()
    Dim sResponse As String, HTML As New HTMLDocument, linkList As Object, i As Long
    Const BASE_URL As String = "https://stackoverflow.com"
    Const TARGET_QUESTION As String = "How to make the word invisible when it's checked without js"
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://stackoverflow.com/search?q=" & URLEncode(TARGET_QUESTION), False
        .send
        sResponse = StrConv(.responseBody, vbUnicode)
    End With
    sResponse = Mid$(sResponse, InStr(1, sResponse, "<!DOCTYPE "))

    With HTML
        .body.innerHTML = sResponse
        Set linkList = .querySelectorAll("a.question-hyperlink[href]")
        For i = 0 To linkList.Length - 1
            Debug.Print Replace$(linkList.item(i).getAttribute("href"), "about:", BASE_URL)
        Next i
    End With
    If linkList Is Nothing Then Exit Sub
    'Code using IE and linkList
End Sub

'https://stackoverflow.com/questions/218181/how-can-i-url-encode-a-string-in-excel-vba   @Tomalak
Public Function URLEncode( _
   StringVal As String, _
   Optional SpaceAsPlus As Boolean = False _
) As String

  Dim StringLen As Long: StringLen = Len(StringVal)

  If StringLen > 0 Then
    ReDim result(StringLen) As String
    Dim i As Long, CharCode As Integer
    Dim Char As String, Space As String

    If SpaceAsPlus Then Space = "+" Else Space = "%20"

    For i = 1 To StringLen
      Char = Mid$(StringVal, i, 1)
      CharCode = Asc(Char)
      Select Case CharCode
        Case 97 To 122, 65 To 90, 48 To 57, 45, 46, 95, 126
          result(i) = Char
        Case 32
          result(i) = Space
        Case 0 To 15
          result(i) = "%0" & Hex(CharCode)
        Case Else
          result(i) = "%" & Hex(CharCode)
      End Select
    Next i
    URLEncode = Join(result, "")
  End If
End Function
  1. 這個If element.className = "question-hyperlink" Then是無用的,因為它總是正確的,因為您getElementsByClassName("question-hyperlink")所以所有元素都絕對是question-hyperlink類。 If語句可以刪除。

  2. 您在variable element具有每個鏈接,因此不需要count 代替html.getElementsByClassName("question-hyperlink")(count).innerText使用element.innerText

所以它應該看起來像這樣:

Set elements = html.getElementsByClassName("question-hyperlink")
Dim erow As Long

For Each element In elements
    erow = Worksheets("Exec").Cells(Rows.count, 1).End(xlUp).Offset(1, 0).Row
    Worksheets("Exec").Cells(erow, 1) = element.innerText
    Worksheets("Exec").Cells(erow, 2) = element.GetAttribute("href") 'this should give you the URL
Next element

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM