I am trying to get a table from a website. The problem is that I need to login first in order to access this information.
My code is below. I have hit a road block and most of the guides I found out there do not work with this site. Appreciate your help.
Private Sub Worksheet_Change(ByVal Target As Range)
Dim KeyCells As Range
' The variable KeyCells contains the cells that will cause an alert when they are changed.
Set KeyCells = Range("H1")
If Not Application.Intersect(KeyCells, Range(Target.Address)) Is Nothing Then
' Clear contents of Sheet 1
'
Worksheets("Sheet1").Cells.Clear
Range("A1").Select
'
'Login to the website
'
Dim IE As Object
Set IE = CreateObject("InternetExplorer.application")
With IE
.Visible = True
.navigate ("https://www.gurufocus.com/forum/login.php?0")
While .Busy Or .readyState <> 4: DoEvents: Wend
.document.all("Template_GLE_Login_LoginView1_login_UserName").Focus
.document.all("Template_GLE_Login_LoginView1_login_UserName").Value = "Username"
.document.all("Template_GLE_Login_LoginView1_login_Password").Focus
.document.all("Template_GLE_Login_LoginView1_login_Password").Value = "Password"
.document.all("Template_GLE_Login_LoginView1_login_LoginButton").Click
While .Busy Or .readyState <> 4: DoEvents: Wend
Debug.Print .LocationURL
End With
'
' take the Ticker in sheet Blank cell H1
Dim Ticker As String
Ticker = Sheets("Blank").Range("H1")
URL = "URL;http://www.gurufocus.com/financials/" & Ticker
'
' get the data from the website
Range("A1").Select
With Sheets("Sheet1").QueryTables.Add(Connection:=URL, Destination:=Sheets("Sheet1").Range("$A$1"))
' .CommandType = 0
.Name = Ticker
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
' .PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
' .RefreshPeriod = 0
' .WebSelectionType = xlSpecifiedTables
' .WebFormatting = xlWebFormattingNone
' .WebTables = """Rf"""
' .WebPreFormattedTextToColumns = True
' .WebConsecutiveDelimitersAsOne = True
' .WebSingleBlockTextImport = False
' .WebDisableDateRecognition = False
' .WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End If
End Sub
First off, I would suggest that you get out of the Worksheet_Change
event macro as soon as possible. All kinds of things can go wrong while you are stuck there processing a length routine and launching an InternetExplorer object to scrape web data is one of the slowest.
Private Sub Worksheet_Change(ByVal Target As Range)
Dim KeyCells As Range
Set KeyCells = Range("H1")
If Not Application.Intersect(KeyCells, Target) Is Nothing Then
On Error GoTo bm_Safe_Exit
Application.EnableEvents = False
Worksheets("Sheet1").Cells.Clear
'if this is Sheet1's Worksheet_Change then the following
'would be more succinct and acknowledges that we are in Sheet1's bailywick
'Me.Cells.Clear
'Range("A1").Select try to work without .Select
'allow 1 second to get out of the Worksheet_Change
Application.OnTime Now + TimeSerial(0, 0, 1), "process_Web_Data"
End If
bm_Safe_Exit:
Application.EnableEvents = True
End Sub
So all that does is trap and evaluate the event. If H1 is involved, it clears Sheet1 and launches a public sub (stored in a Module sheet) and gets out of Dodge. The sub launch is a scant second after being scheduled and that should be more than enough time to exit the event macro.
In a module sheet:
I added Microsoft HTML Object library and Microsoft Internet controls to the VBE's Tools ► References for the following code.
Sub process_Web_Data()
Dim ie As New SHDocVw.InternetExplorer
With ie
.Visible = True
.navigate "https://www.gurufocus.com/forum/login.php?0"
While .Busy Or .readyState <> 4: DoEvents: Wend
With .document
.getelementbyid("txt-username").Value = "Username"
.getelementbyid("txt-password").Value = "Password"
.getelementbyid("login_form").submit
End With
While .Busy Or .readyState <> 4: DoEvents: Wend
Debug.Print .LocationURL
'-----------------
'do all of your other stuff here
'-----------------
End With
End Sub
That is sufficient to get the log in page's 'That Email/username/password was not found or is inactive. Please try again.' screen so the login process is working; just not the credentials.
As far as referencing Sheet1 from that public sub, the Worksheet.CodeName property , Worksheet.Name property or Worksheet.Index property could be used. I would probably opt for the codename.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.