
Python Errno 10054 while working with Selenium and BeautifulSoup

I'm trying to download all the PDF files that can be downloaded without a login or subscription from this web page, but I get this error:

[Errno 10054] An existing connection was forcibly closed by the remote host

How can I solve this error?

# -*- coding: utf-8 -*-

from selenium import webdriver
from bs4 import BeautifulSoup
import urllib2 as ul

def download_pdf(file_name, download_url):
    response = ul.urlopen(download_url)
    file = open(file_name + ".pdf", 'wb')
    file.write(response.read())
    file.close()
    print("Completed")

chrome_path = r"C:\Users\HarutakaKawamura\Desktop\bs\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get('https://www.osapublishing.org/search.cfm?q=comsol&meta=1&cj=1&cc=1')

driver.implicitly_wait(10)
links = driver.find_elements_by_xpath("//a[contains(text(), 'PDF')]")
titles = driver.find_elements_by_xpath("//h3[contains(@class, 'sri-title')]")

for i in range(len(links)):

    href = links[i].get_attribute("href")
    bs = BeautifulSoup(ul.urlopen(href), 'lxml')

    if len(str(bs)) < 1000:

        download_url = bs.findAll("frame")[1]['src']
        file_name = titles[i].find_element_by_tag_name("a").text
        download_pdf(file_name, download_url)

Try this:

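import errno
import socket
import urllib2 as ul  # same alias as in the question's script

def download_pdf(file_name, download_url):
    # Retry the download when the remote host resets the connection.
    error_counter = 0
    while error_counter <= 5:
        try:
            response = ul.urlopen(download_url)
            # The with block closes the file even if the read fails.
            with open(file_name + ".pdf", 'wb') as pdf_file:
                pdf_file.write(response.read())
            print("Completed")
            break
        except socket.error as error:
            # Count connection resets; re-raise any other socket error.
            if error.errno == errno.ECONNRESET:
                error_counter += 1
            else:
                raise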
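
For anyone on Python 3, where urllib2 no longer exists, here is a rough sketch of the same retry idea. It is not part of the original answer. The names is_connection_reset and fetch_with_retry are made up for illustration, and the growing pause between attempts and the re-raise once the retries run out are my own additions. The retry count of 5 is taken from the code above.

```python
import errno
import time
import urllib.error
import urllib.request


def is_connection_reset(exc):
    """Return True if exc, or the OSError wrapped by a URLError, is a connection reset."""
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, OSError):
        exc = exc.reason
    return (isinstance(exc, ConnectionResetError)
            or getattr(exc, "errno", None) == errno.ECONNRESET)


def fetch_with_retry(fetch, max_retries=5, delay=1.0, sleep=time.sleep):
    """Call fetch() and retry on connection resets, pausing a bit longer each time.

    Hypothetical helper: any other error, or a reset on the final attempt,
    is re-raised to the caller instead of being swallowed.
    """
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except OSError as error:
            if not is_connection_reset(error) or attempt == max_retries:
                raise
            sleep(delay * (attempt + 1))


def download_pdf(file_name, download_url):
    """Assumed Python 3 counterpart of download_pdf from the answer above."""
    def fetch():
        with urllib.request.urlopen(download_url) as response:
            return response.read()

    data = fetch_with_retry(fetch)
    with open(file_name + ".pdf", "wb") as pdf_file:
        pdf_file.write(data)
    print("Completed")
```

The rest of the script can keep calling download_pdf(file_name, download_url) as before. Pausing between attempts is only a guess at what might help when the server drops connections that arrive too quickly. If the resets continue, the cause is probably on the server side and retrying alone may not be enough.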

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please cite this site's URL or the original address.
