How to bypass or catch the error of socket.timeout when checking if a website is working or not?
I have been developing a program that checks whether a website is working or not. I fetch URLs from an Excel sheet and then write the results back as True and False in the same Excel sheet, but for some URLs I get a socket.timeout error and the code stops working after that. Here is the code:
import http.client as httpc
from urllib.parse import urlparse
import pandas as pd
import xlwings as xw
import smtplib
from xlsxwriter import Workbook
import socket
x=[]
df = pd.read_excel (r'xyz.xlsx')
df1=pd.DataFrame(df,columns=['URL'])
print(df1)
url_list=df["URL"].tolist()
print(url_list)
for i in url_list:
    def checkUrl(i):
        if 'http' not in i:
            i= 'https://'+i
        p = urlparse(i)
        conn = httpc.HTTPConnection(p.netloc,timeout=4)
        conn.request('HEAD', p.path)
        try:
            resp = conn.getresponse()
            return resp.status<400
        except requests.exceptions.RequestException:
            return False
    print(checkUrl(i))
    x.append(checkUrl(i))
workbook = Workbook('abc.xlsx')
Report_Sheet = workbook.add_worksheet()
Report_Sheet.write(0, 1, 'Value')
Report_Sheet.write_column(1, 1, x)
workbook.close()
There are many problems in this code. You execute the request outside of the try: block, so an error raised while connecting or sending is never caught. The except clause expects requests.exceptions.RequestException, which cannot be thrown by your code. You also always open a plain HTTPConnection, even when the URL uses https. As you are not using the requests library, but the low-level http.client, you should only expect errors from the socket library, which are all subclasses of OSError.
Your code could become (beware: untested):
def checkUrl(i):
    if 'http' not in i:
        i= 'https://'+i
    p = urlparse(i)
    if (p.scheme == 'http'):
        conn = httpc.HTTPConnection(p.netloc,timeout=4)
    else:
        conn = httpc.HTTPSConnection(p.netloc,timeout=4)
    try:
        conn.request('HEAD', p.path)
        resp = conn.getresponse()
        return resp.status<400
    except OSError:
        return False
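The claim that the socket-level errors all derive from OSError can be checked directly. The short sketch below is only an illustration (the list name network_errors is made up for this example). It also shows that http.client.HTTPException is a separate hierarchy, so a bare except OSError does not cover every protocol-level failure.

```python
import http.client
import socket
import ssl

# Exceptions that can surface while connecting or waiting for a reply.
network_errors = [
    socket.timeout,          # no answer within the timeout
    socket.gaierror,         # hostname could not be resolved
    ssl.SSLError,            # TLS handshake problems
    ConnectionRefusedError,  # nothing is listening on the port
]

for exc in network_errors:
    print(exc.__name__, "is an OSError subclass:", issubclass(exc, OSError))

all_os_errors = all(issubclass(exc, OSError) for exc in network_errors)

# HTTPException derives from Exception, not from OSError.
http_exception_is_os_error = issubclass(http.client.HTTPException, OSError)
print("HTTPException is an OSError subclass:", http_exception_is_os_error)
```

If protocol errors such as a malformed status line should also count as "not working", catching (http.client.HTTPException, OSError) together is one option.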
In my experience this error happens when the hostname still resolves to a valid IP address, but the server is no longer configured to work with that hostname. This results in the server ignoring your attempts to connect to it.
To handle this, you should return False on timeout errors.
import socket
try:
    resp = conn.getresponse()
    return resp.status<400
except requests.exceptions.RequestException:
    return False
except socket.timeout as err:
    return False
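The timeout can be reproduced locally without depending on a real website. The sketch below is an assumption-laden illustration, not part of the original answer: the helper name head_times_out and the "silent" listener are invented for the demo. The listener accepts the TCP connection at the kernel level but never replies, which resembles a server that ignores the request.

```python
import http.client
import socket


def head_times_out(host, port, timeout=0.5):
    """Return True if a HEAD request to host:port hits the socket timeout."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        conn.getresponse()  # blocks until a reply arrives or the timeout expires
        return False
    except socket.timeout:
        return True
    finally:
        conn.close()


# A listening socket that never reads or answers: the connection is
# established through the listen backlog, but no response is ever sent.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
silent_port = listener.getsockname()[1]

timed_out = head_times_out("127.0.0.1", silent_port)
listener.close()
print("request timed out:", timed_out)
```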
You will want to check for an http.client.HTTPException instead of a requests.exceptions.RequestException, because the check that you are doing uses the http.client library and not the requests library. In addition, you will also want to catch all of the following errors.
import socket
import ssl
import http.client
try:
    # The connection is opened by conn.request, so it has to run inside
    # the try (not before it) for the DNS and SSL handlers below to apply
    conn.request('HEAD', p.path)
    resp = conn.getresponse()
    return resp.status < 400
except http.client.HTTPException as err:
    # A connection was established, but the request failed
    return False
except socket.timeout as err:
    # The server did not answer in time, e.g. the website no longer exists on the server
    return False
except socket.gaierror as err:
    # Could not resolve the hostname to an IP address
    return False
except ssl.CertificateError as err:
    # The SSL certificate was never configured, or it cannot be trusted
    return False
except ssl.SSLError as err:
    # Other SSL errors not covered by ssl.CertificateError
    return False
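Putting the pieces from the answers together, a self-contained checker might look like the sketch below. This is one possible arrangement under stated assumptions, not a definitive implementation: the name check_url, the timeout parameter, and the choice to collapse all handlers into a single except clause are illustrative, and it only prepends https:// when the URL has no http:// or https:// prefix.

```python
import http.client
from urllib.parse import urlparse


def check_url(url, timeout=4.0):
    """Return True if a HEAD request to url gets a status below 400."""
    if not url.startswith(("http://", "https://")):
        url = "https://" + url
    parts = urlparse(url)
    if parts.scheme == "http":
        conn_class = http.client.HTTPConnection
    else:
        conn_class = http.client.HTTPSConnection

    conn = None
    try:
        # Constructing the connection can raise InvalidURL (bad port),
        # and request() is where DNS lookup, connect and TLS happen,
        # so both sit inside the try.
        conn = conn_class(parts.netloc, timeout=timeout)
        conn.request("HEAD", parts.path or "/")
        return conn.getresponse().status < 400
    except (http.client.HTTPException, OSError):
        # OSError covers socket.timeout, socket.gaierror, ssl.SSLError
        # and refused connections; HTTPException covers protocol errors.
        return False
    finally:
        if conn is not None:
            conn.close()
```

With a function like this, the results list from the question could be built as x = [check_url(u) for u in url_list], and a single unreachable site no longer stops the loop.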
My first guess is that
resp = conn.getresponse()
should be inside the try clause. If that doesn't work, please add the output of the program.