Importing specific web data to Excel using VBA

I'm very much a beginner on the VBA coding scene (web scripting is more my thing), but I need to create an Excel-based program that imports data from an intranet web application into a spreadsheet. Here's the gist of what I'm looking to set up. In the spreadsheet the user will enter the following info: username, password, a list of customer account numbers and a date range. The user will then click a command button that makes the following happen:

  1. Open web based program, login (based on login/password typed into spreadsheet) and navigate to the account search screen.

  2. Enter first customer account number into search field and click the "search" button to navigate to the specific customer account.

  3. Navigate to the "search activity" screen, enter the date range and click the "search activity" button.

  4. Pull the data from a specific column of the activity table and import the data to the spreadsheet.

  5. If there are multiple pages of data there will be a "Next Results" button. A loop should click that button (if it exists) and pull the same column of data from each page until the button no longer exists (no more data).

  6. Once there are no more pages of data (or if there is only one page) the macro will loop back and navigate to the account search screen and perform the same operations for each account in the list of accounts typed into the spreadsheet until there are no other accounts.

  7. Once completed (all data successfully imported to the spreadsheet) it should close the IE window.

It's a little complicated, and I realize Excel/VBA is definitely not the best tool for this, but unfortunately it's what I have to use in this instance. I've been able to piece together some VBA that does almost everything above. The problem is that looping through the activity pages and pulling the data just will not work, and I get a wide range of errors that only confuse me more. Sometimes it pulls data from the first page, clicks the "Next Results" button, gets to the next page and throws an error; sometimes it gets through two or three pages before throwing one. The most common error is "permission denied". Also, this code currently pulls the data for only one account. I was hoping that once it worked for one account it would be simple to wrap the whole thing in a loop that goes down the list of account numbers and does the same for each. I've been stuck on this for a number of weeks and I'm ready to toss out the whole thing and start from scratch, so any help would be very much appreciated!

Below is the code I have so far...

Private Sub CommandButton1_Click()

    ' open IE, navigate to the desired page and loop until fully loaded
    Set IE = New InternetExplorerMedium
    my_url = "https://customerinfo/pages/login.jsp"
    my_url2 = "https://customerinfo/pages/searchCustomer.jsp"
    my_url3 = "https://customerinfo/pages/searchAccountActivity.jsp"

    With IE
        .Visible = True
        .navigate my_url
        Do Until Not .Busy And .readyState = 4
            DoEvents
        Loop
    End With

    ' Input the userid and password
    IE.document.getElementById("userId").Value = [B2]
    IE.document.getElementById("password").Value = [B3]

    ' Click the "Login" button
    IE.document.getElementById("action").Click
    Do Until Not IE.Busy And IE.readyState = 4
        DoEvents
    Loop

    ' Navigate to Search screen
    With IE
        .navigate my_url2
        Do Until Not .Busy And .readyState = 4
            DoEvents
        Loop
    End With

    ' Input the account number & click search
    IE.document.getElementById("accountNumber").Value = [B5]
    IE.document.getElementById("action").Click
    Do Until Not IE.Busy And IE.readyState = 4
        DoEvents
    Loop

    With IE
        .navigate my_url3
        Do Until Not .Busy And .readyState = 4
            DoEvents
        Loop
    End With

    'Input search criteria
    IE.document.getElementById("store").Value = [C7]
    IE.document.getElementById("dateFromMonth").Value = [C10]
    IE.document.getElementById("dateFromDay").Value = [B11]
    IE.document.getElementById("dateFromYear").Value = [B12]
    IE.document.getElementById("timeFromHour").Value = [B20]
    IE.document.getElementById("timeFromMinute").Value = [B21]
    IE.document.getElementById("dateToMonth").Value = [C15]
    IE.document.getElementById("dateToDay").Value = [B16]
    IE.document.getElementById("dateToYear").Value = [B17]
    IE.document.getElementById("timeToHour").Value = [B24]
    IE.document.getElementById("timeToMinute").Value = [B25]
    IE.document.getElementById("action").Click
    Do Until Not IE.Busy And IE.readyState = 4
        DoEvents
    Loop

    'Pulls data from activity search
    Dim TDelements As IHTMLElementCollection
    Dim TDelement As HTMLTableCell
    Dim r As Long, i As Long
    Dim e As Object

    Application.Wait Now + TimeValue("00:00:05")
    Set TDelements = IE.document.getElementsByTagName("tr")
    r = 0
    For i = 1 To 1
        Application.Wait Now + TimeValue("00:00:03")
        For Each TDelement In TDelements
            If TDelement.className = "searchActivityResultsOldContent" Then
                Sheet1.Range("E1").Offset(r, 0).Value = TDelement.ChildNodes(8).innerText
                r = r + 1
            ElseIf TDelement.className = "searchActivityResultsNewContent" Then
                Sheet1.Range("E1").Offset(r, 0).Value = TDelement.ChildNodes(8).innerText
                r = r + 1
            End If
        Next
        Application.Wait Now + TimeValue("00:00:02")
        Set elems = IE.document.getElementsByTagName("input")
        For Each e In elems
            If e.Value = "Next Results" Then
                e.Click
                i = 0
                Exit For
            End If
        Next e
    Next i

    Do Until Not IE.Busy And IE.readyState = 4
        DoEvents
    Loop
    IE.Quit

End Sub

So, what happens after you click the "Next..." element? Let me describe an issue I encountered. Assume the code flows as follows:

  1. Create an IE instance and navigate to some URL, e.g. the first search results page.
  2. Check whether the page is loaded and ready, and wait until it is.
  3. Create a DispHTMLElementCollection of the target elements, retrieved by .document.getElementsByTagName() or similar.
  4. Loop through the elements of the collection and do some work with them.
  5. Click the "Next..." element. The issue is that in some cases the next page doesn't start downloading immediately after the click, because of some JS or XHR processing.
  6. Run the conventional check that the next page is loaded and ready. Because the download has not started yet, the check passes immediately, and the page that is still displayed is mistaken for the next page. Simple delays of a few seconds are not a reliable way to wait for the new page.
  7. Create the DispHTMLElementCollection again, but by mistake from the page that is still displayed instead of the next page.
  8. Loop through the elements of that collection. While the loop is in progress, the next page starts downloading. The collection still holds references to the objects, but the page that owned them has been unloaded. Accessing an element of the unloaded page, or a document object that has become unresponsive, then raises the "permission denied" error.
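One way to reduce the exposure to steps 7 and 8 is to request the rows afresh on every page and copy the text you need into plain VBA values before anything triggers a navigation. The sketch below is only an illustration: ReadActivityColumn and its cellIndex parameter are hypothetical names, the class names are taken from the question's code, and which cell index corresponds to the question's ChildNodes(8) is an assumption you would need to verify against the real table.

```vba
Option Explicit

' Hypothetical helper (not from the original post): copy one column of the
' activity table into a Collection of plain strings, so that nothing keeps
' pointing at the page's DOM once the page is unloaded.
Private Function ReadActivityColumn(ByVal doc As Object, _
                                    ByVal cellIndex As Long) As Collection
    Dim result As New Collection
    Dim row As Object

    ' Ask the document for its rows on every call; a collection obtained
    ' from an earlier page must never be reused.
    For Each row In doc.getElementsByTagName("tr")
        Select Case row.className
            Case "searchActivityResultsOldContent", _
                 "searchActivityResultsNewContent"
                ' cellIndex is an assumption: check which <td> holds the data.
                If row.cells.Length > cellIndex Then
                    result.Add CStr(row.cells.Item(cellIndex).innerText)
                End If
        End Select
    Next row

    Set ReadActivityColumn = result
End Function
```

Calling this with IE.document before clicking "Next..." means the later write to the sheet works only with strings. It does not solve step 6 by itself: the code still has to be sure the new page has really arrived before the function is called again.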

My suggestion is to avoid clicking "Next..." at all: read the next page's URL from the .href property of the "Next..." anchor <a> element, call IE.navigate with that URL, and then check the page readiness.

Take a look at the example implementing that approach.
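A minimal sketch of that approach follows, under some assumptions. It presumes the "Next Results" control is an <a> element with a usable href; in the question's code it is an <input> button, in which case this will not find it and the target URL would have to be obtained some other way. WaitUntilReady, NextPageUrl and ScrapeAllPages are hypothetical names, and late binding (As Object) is used so that no library references are required.

```vba
Option Explicit

' Hypothetical helper: wait until IE reports the page as complete,
' or raise an error after timeoutSec seconds instead of hanging forever.
Private Sub WaitUntilReady(ByVal ie As Object, _
                           Optional ByVal timeoutSec As Long = 30)
    Dim deadline As Date
    deadline = DateAdd("s", timeoutSec, Now)
    Do While ie.Busy Or ie.readyState <> 4      ' 4 = READYSTATE_COMPLETE
        DoEvents
        If Now > deadline Then
            Err.Raise vbObjectError + 513, , "Page did not finish loading in time"
        End If
    Loop
End Sub

' Hypothetical helper: return the href of the first anchor whose text
' contains linkText, or an empty string when there is no such anchor.
Private Function NextPageUrl(ByVal doc As Object, _
                             ByVal linkText As String) As String
    Dim a As Object
    For Each a In doc.getElementsByTagName("a")
        If InStr(1, a.innerText, linkText, vbTextCompare) > 0 Then
            NextPageUrl = a.href
            Exit Function
        End If
    Next a
End Function

' Walk the result pages by navigating to each "next" URL explicitly.
Private Sub ScrapeAllPages(ByVal ie As Object)
    Dim url As String
    Do
        WaitUntilReady ie
        ' ... read the rows of ie.document for this page here ...
        url = NextPageUrl(ie.document, "Next Results")
        If Len(url) = 0 Then Exit Do            ' no link left: last page
        ie.navigate url
    Loop
End Sub
```

Because the navigation is started by the macro itself rather than by page script, the readiness check is far less likely to pass against the old page, which is the point of the suggestion above.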

IMO the most efficient way is to use XHR, like this, this and this.
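For illustration only, an XHR-based fetch might look roughly like the sketch below. FetchDocument is a hypothetical name, and the sketch assumes the page can be requested with a plain GET and that the request carries a valid login session; for the intranet application in the question, neither is confirmed, and the login and search forms may require POST requests instead.

```vba
Option Explicit

' Hypothetical helper: download a page without a browser window and
' return it as a parsed HTML document.
Private Function FetchDocument(ByVal url As String) As Object
    Dim http As Object
    Dim doc As Object

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False                 ' False = synchronous request
    http.send

    If http.Status <> 200 Then
        Err.Raise vbObjectError + 514, , "Request failed with HTTP status " & http.Status
    End If

    ' Parse the response text into an in-memory document.
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = http.responseText
    Set FetchDocument = doc
End Function
```

The returned document can be queried with getElementsByTagName just like IE.document, and since no page is ever loaded or unloaded in a browser, the stale-reference problem described above does not arise.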
