I have a list of domain names, and many of them redirect me to the same domain. For instance... foo1.com, foo2.csm and foo3.com all deposit me at foo.com.
I'm trying to deduplicate the list of domains by writing a VBA script to load the final page and extract it's URL.
I've started from this article which retrieve's the page's title ( http://www.excelforum.com/excel-programming-vba-macros/355192-can-i-import-raw-html-source-code-into-excel.html ), but can't figure out how to modify it to get the final URL (from which I can extract the URL.
Can anyone please point me in the right direction?
Try this, need to look at .LocationURL:
Public Function gsGetFinalURL(rsURL As String) As String
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
With ie
.navigate rsURL
Do While .Busy And Not .ReadyState = 4
DoEvents
Loop
gsGetFinalURL = .LocationURL
.Quit
End With
Set ie = Nothing
End Function
I haven't tried it on a huge variety of URLs, just the one you gave and a couple of others. If it is an invalid URL it will return what is passed. You can use the code from the original function to check and handle accordingly.
Add a reference to "Microsoft XML, v3.0" (or whatever version you have)
Sub tester()
Debug.Print CheckRedirect("adhpn2.com")
End Sub
Function CheckRedirect(URL As String)
If Not UCase(URL) Like "HTTP://*" Then URL = "http://" & URL
With New msxml2.ServerXMLHTTP40
.Open "HEAD", URL, False
.send
CheckRedirect = .getOption(-1)
End With
End Function
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.