Python selenium PhantomJS proxy
This is my code:
from selenium import webdriver

proxylist = ['58.12.12.12:80', '69.12.12.12:80']
weblist = ['https://www.google.com', 'https://www.facebook.com', 'https://www.yahoo.com', 'https://aol.com']

for s in range(len(proxylist)):
    service_args = ['--proxy=%s' % (proxylist[s]), '--proxy-type=socks5']
    driver = webdriver.PhantomJS('phantomjs.exe', service_args=service_args)
    for s in weblist:
        driver.get(s)
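The idea is that the browser first visits those sites using proxylist[0]. If proxylist[0] times out on weblist[2], then proxylist[1] should carry on with weblist[3]. I think I should use try and except, but I don't know where to put them. Any help is appreciated!
Try something like this. Basically, we swap the inner and outer loops and add a try/except: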
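from selenium import webdriver
from selenium.common.exceptions import TimeoutException

# proxylist and weblist are the lists defined in the question
for url in weblist:
    for proxy in proxylist:
        try:
            service_args = ['--proxy=%s' % proxy, '--proxy-type=socks5']
            driver = webdriver.PhantomJS('phantomjs.exe', service_args=service_args)
            driver.get(url)
            break  # page loaded, move on to the next URL
        except TimeoutException:
            print('timed out')  # try the same URL with the next proxy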
A try/except that catches a timeout looks like this:
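from selenium.common.exceptions import TimeoutException

# driver is an already created webdriver instance
try:
    driver.set_page_load_timeout(1)  # seconds
    driver.get("http://www.example.com")
except TimeoutException as ex:
    print("Exception has been thrown. " + str(ex))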
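The same pattern can be wrapped in a small helper so that the calling loop only sees a boolean. This is a minimal sketch, not part of the original answer: `try_get` is a hypothetical name, and the `ImportError` fallback is an assumption added only so the snippet can run without Selenium installed. It assumes `driver` is any object exposing `set_page_load_timeout` and `get`, as Selenium drivers do.

```python
try:
    from selenium.common.exceptions import TimeoutException
except ImportError:
    # Assumption: stand-in so the sketch runs without Selenium installed.
    class TimeoutException(Exception):
        pass


def try_get(driver, url, timeout=2):
    """Load url with a page-load timeout.

    Returns True if the page loaded, False if the load timed out.
    try_get is a hypothetical helper name, not a Selenium API.
    """
    driver.set_page_load_timeout(timeout)
    try:
        driver.get(url)
    except TimeoutException:
        return False
    return True
```

A caller could then write `if not try_get(driver, url): ...` and put the proxy-switching logic in that branch.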
Applied to your code, it would look like this:
from selenium import webdriver
from selenium.common.exceptions import TimeoutException

proxylist = ['58.12.12.12:80', '69.12.12.12:80']
weblist = ['https://www.google.com', 'https://www.facebook.com', 'https://www.yahoo.com', 'https://aol.com']

def test():
    temp_count_proxy = 0  # index of the proxy currently in use
    driver_opened = 0     # 1 while a PhantomJS instance is running
    for url in weblist:
        # >= rather than >: an index equal to len(proxylist) is already out of range
        if temp_count_proxy >= len(proxylist):
            print("Out of proxy")
            return
        if driver_opened == 0:
            service_args = ['--proxy={}'.format(proxylist[temp_count_proxy]), '--proxy-type=socks5']
            driver = webdriver.PhantomJS('phantomjs.exe', service_args=service_args)
            driver_opened = 1
        try:
            driver.set_page_load_timeout(2)
            driver.get(url)
        except TimeoutException:
            driver.quit()  # quit() also ends the PhantomJS process; close() only closes the window
            driver_opened = 0
            temp_count_proxy += 1

test()
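Note that if it fails to load a URL, it switches to the next proxy and moves on to the next URL (as you asked), but it does not retry the URL that failed.
If you want it to switch proxies and then retry the current URL after a failure, use the following: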
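To see which URLs end up being skipped, the control flow can be simulated without a browser. The following is a rough sketch under stated assumptions: `visit_skipping_failures` and the `fetch(proxy, url)` callback are hypothetical stand-ins for the PhantomJS calls, a timeout is signalled with Python's built-in `TimeoutError` rather than Selenium's `TimeoutException`, and the failure in `fake_fetch` is invented purely for illustration.

```python
def visit_skipping_failures(urls, proxies, fetch):
    """Visit each URL once; on a timeout, switch to the next proxy and move on.

    fetch(proxy, url) is a hypothetical callback that raises TimeoutError
    when the page does not load in time. Returns (loaded, skipped).
    """
    loaded, skipped = [], []
    proxy_index = 0
    for position, url in enumerate(urls):
        if proxy_index >= len(proxies):
            # Out of proxies: everything that is left counts as skipped.
            skipped.extend(urls[position:])
            break
        try:
            fetch(proxies[proxy_index], url)
            loaded.append(url)
        except TimeoutError:
            skipped.append(url)  # this URL is not retried
            proxy_index += 1
    return loaded, skipped


def fake_fetch(proxy, url):
    # Assumed failure for illustration: the first proxy times out on yahoo.
    if proxy == '58.12.12.12:80' and 'yahoo' in url:
        raise TimeoutError(url)


proxylist = ['58.12.12.12:80', '69.12.12.12:80']
weblist = ['https://www.google.com', 'https://www.facebook.com',
           'https://www.yahoo.com', 'https://aol.com']
loaded, skipped = visit_skipping_failures(weblist, proxylist, fake_fetch)
print(loaded)
print(skipped)
```

In this simulated run, https://www.yahoo.com lands in `skipped` and https://aol.com is loaded through the second proxy, which matches the behaviour described in the note.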
from selenium import webdriver
from selenium.common.exceptions import TimeoutException

proxylist = ['58.12.12.12:80', '69.12.12.12:80']
weblist = ['https://www.google.com', 'https://www.facebook.com', 'https://www.yahoo.com', 'https://aol.com']

def test():
    temp_count_proxy = 0  # index of the proxy currently in use
    driver_opened = 0     # 1 while a PhantomJS instance is running
    for url in weblist:
        while True:
            # >= rather than >: an index equal to len(proxylist) is already out of range
            if temp_count_proxy >= len(proxylist):
                print("Out of proxy")
                return
            if driver_opened == 0:
                service_args = ['--proxy={}'.format(proxylist[temp_count_proxy]), '--proxy-type=socks5']
                driver = webdriver.PhantomJS('phantomjs.exe', service_args=service_args)
                driver_opened = 1
            try:
                driver.set_page_load_timeout(2)
                driver.get(url)
                # Your code to process here
            except TimeoutException:
                driver.quit()  # quit() also ends the PhantomJS process; close() only closes the window
                driver_opened = 0
                temp_count_proxy += 1
                continue  # retry the same URL with the next proxy
            break  # page loaded, go on to the next URL

test()
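The retry logic can also be separated from the browser handling, which makes it easier to check in isolation. This is a minimal sketch, assuming a hypothetical `fetch(proxy, url)` callback that raises the built-in `TimeoutError` on a timeout; `visit_with_failover` and `flaky_fetch` are illustrative names, not part of Selenium, and the simulated failure is invented.

```python
def visit_with_failover(urls, proxies, fetch):
    """Visit every URL, retrying the same URL on the next proxy after a timeout.

    Returns a dict mapping each loaded URL to the proxy that served it.
    Stops early once every proxy has timed out.
    """
    served_by = {}
    remaining = iter(proxies)
    proxy = next(remaining, None)
    for url in urls:
        while proxy is not None:
            try:
                fetch(proxy, url)
            except TimeoutError:
                proxy = next(remaining, None)  # drop this proxy, retry same URL
            else:
                served_by[url] = proxy
                break
        if proxy is None:
            break  # out of proxies
    return served_by


def flaky_fetch(proxy, url):
    # Assumed failure for illustration: the first proxy times out on yahoo.
    if proxy == '58.12.12.12:80' and 'yahoo' in url:
        raise TimeoutError(url)


proxies = ['58.12.12.12:80', '69.12.12.12:80']
urls = ['https://www.google.com', 'https://www.facebook.com',
        'https://www.yahoo.com', 'https://aol.com']
result = visit_with_failover(urls, proxies, flaky_fetch)
for url, proxy in result.items():
    print(url, 'via', proxy)
```

In a real script, `fetch` would create or reuse the PhantomJS driver for the given proxy and call `driver.get(url)`, translating Selenium's `TimeoutException` into whatever signal the loop expects.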