Python Errno 10054 while working with Selenium and BeautifulSoup
I'm trying to download all the PDF files on this web page that can be downloaded without a login or subscription, but I get this error:
[Errno 10054] An existing connection was forcibly closed by the remote host
How can I solve this error?
# -*- coding: utf-8 -*-
from selenium import webdriver
from bs4 import BeautifulSoup
import urllib2 as ul  # Python 2 only


def download_pdf(file_name, download_url):
    # Fetch the PDF and write it to <file_name>.pdf in binary mode.
    response = ul.urlopen(download_url)
    with open(file_name + ".pdf", 'wb') as pdf_file:
        pdf_file.write(response.read())
    print("Completed")


chrome_path = r"C:\Users\HarutakaKawamura\Desktop\bs\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get('https://www.osapublishing.org/search.cfm?q=comsol&meta=1&cj=1&cc=1')
driver.implicitly_wait(10)
links = driver.find_elements_by_xpath("//a[contains(text(), 'PDF')]")
titles = driver.find_elements_by_xpath("//h3[contains(@class, 'sri-title')]")

for i in range(len(links)):
    href = links[i].get_attribute("href")
    bs = BeautifulSoup(ul.urlopen(href), 'lxml')
    # Short pages are assumed to be the frame wrapper around a freely
    # downloadable PDF; the second frame holds the actual file URL.
    if len(str(bs)) < 1000:
        download_url = bs.findAll("frame")[1]['src']
        file_name = titles[i].find_element_by_tag_name("a").text
        download_pdf(file_name, download_url)
Try this:
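import socket
import errno
import urllib2 as ul  # Python 2 only, as in the question


def download_pdf(file_name, download_url):
    # Retry the download when the remote host resets the connection.
    # Note: after six resets the loop ends without downloading the file.
    error_counter = 0
    while error_counter <= 5:
        try:
            response = ul.urlopen(download_url)
            with open(file_name + ".pdf", 'wb') as pdf_file:
                pdf_file.write(response.read())
            print("Completed")
            break
        except socket.error as error:
            if error.errno == errno.ECONNRESET:
                error_counter += 1  # connection reset: try again
            else:
                raise  # any other socket error is not retried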
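
The retry loop above tries again immediately and gives up silently once its attempts are used up. A minimal sketch of a variant that waits between attempts and re-raises the last error is shown below. It assumes you can move to Python 3, where `urllib2` becomes `urllib.request` and a reset connection surfaces as `ConnectionResetError`. The names `with_retries`, `fetch`, `attempts`, `delay` and `timeout` are illustrative choices, not part of any library or of the original answer.

```python
import time
import urllib.request


def with_retries(fetch, attempts=5, delay=1.0):
    """Call fetch() until it succeeds or the attempts run out.

    Only connection resets (Errno 10054 on Windows) are retried;
    any other exception propagates immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except ConnectionResetError:
            if attempt == attempts:
                raise  # out of attempts: surface the error instead of hiding it
            time.sleep(delay * attempt)  # wait a little longer after each failure


def download_pdf(file_name, download_url, timeout=30):
    """Download download_url and save it as <file_name>.pdf."""
    def fetch():
        # The timeout keeps a stalled connection from hanging forever.
        with urllib.request.urlopen(download_url, timeout=timeout) as response:
            return response.read()

    data = with_retries(fetch)
    with open(file_name + ".pdf", "wb") as pdf_file:
        pdf_file.write(data)
```

Spacing out the requests may also help if the server resets connections because requests arrive too quickly, although that is only one possible cause of this error.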