Requests redirection handling: error getting data while parsing with bs4
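Update: the page now redirects via JavaScript, so the information is no longer at the link I was scraping. How can I get the redirected link inside the loop?
I need help with my code. Some time ago I wrote a script that uses requests and BeautifulSoup to extract some information from an HTML page.
Now the code fails and I don't know why.
This is the error: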
Exception has occurred: ConnectionError
HTTPConnectionPool(host='www.mercadopublico.cl', port=80): Max retries exceeded with url: /Procurement/Modules/RFB/DetailsAcquisition.aspx?idlicitacion=1057822-56-LR20 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000022441651250>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
During handling of the above exception, another exception occurred:
During handling of the above exception, another exception occurred:
During handling of the above exception, another exception occurred:
  File "C:\Users\christian marcos\Desktop\googol=pow(10,100).py", line 8, in <module>
    soup = BeautifulSoup(requests.get(url).content)
This is the code:
import requests
from bs4 import BeautifulSoup

idList = ['1057822-56-LR20']
data = []
for idlicitacion in idList:
    url = f'http://www.mercadopublico.cl/Procurement/Modules/RFB/DetailsAcquisition.aspx?idlicitacion={idlicitacion}'
    soup = BeautifulSoup(requests.get(url).content)
    data.append({
        'licitation_number': e.text if (e:=soup.select_one("#lblNumLicitacion")) else 'no licitation number available',
        'responsable': e.text if (e:=soup.select_one("#lblResponsable")) else 'no responsable available',
        'ficha': e.text if (e:=soup.select_one("#lblFicha2Reclamo")) else 'no ficha available',
        'estado_licitacion': e.text if (e:=soup.select_one("#lblFicha1Estado")) else 'no estato licitation available'
    })
data
I hope you can help me.
Exception has occurred: ConnectionError
HTTPConnectionPool(host='www.mercadopublico.cl', port=80): Max retries exceeded with url:
You are getting the error above, which is a connection timeout, because the URL built from the value 1057822-56-LR20 is no longer valid.
For example, your ID produces the following URL:
http://www.mercadopublico.cl/Procurement/Modules/RFB/DetailsAcquisition.aspx?idlicitacion=1057822-56-LR20
It does not open directly in the browser; instead it redirects to a valid URL, which is:
https://www.mercadopublico.cl/Procurement/Modules/RFB/DetailsAcquisition.aspx?qs=a74C100Xv6anaInwzaaUcw==
So the ID-based URL is no longer valid, and requesting it is what raises the error.
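The update in the question mentions a JavaScript redirect. requests only follows HTTP 3xx redirects and does not run scripts, so one possible approach is to read the target out of the page source. The sketch below assumes the page sets `window.location` (or `location.href`) to a quoted string. That pattern, the regular expression, and the helper name `find_js_redirect` are assumptions for illustration and have not been checked against mercadopublico.cl. It also only helps if the first request returns a page at all; in the traceback above the plain-http request times out before any HTML arrives.

```python
import re
from urllib.parse import urljoin

# Assumed redirect pattern, e.g.:
#   window.location.href = 'DetailsAcquisition.aspx?qs=...';
# The real page may redirect in a different way.
_JS_REDIRECT = re.compile(
    r"""(?:window\.|document\.)?location(?:\.href)?\s*=\s*["']([^"']+)["']""",
    re.IGNORECASE,
)


def find_js_redirect(html, base_url):
    """Return the absolute redirect target found in html, or None."""
    match = _JS_REDIRECT.search(html)
    if match is None:
        return None
    # Resolve relative targets against the URL the page was fetched from.
    return urljoin(base_url, match.group(1))
```

Inside the loop this could be called as `find_js_redirect(res.text, res.url)`; when it returns a URL, that URL is requested again and parsed as usual.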
Code, using the redirected URL directly:
import requests
from bs4 import BeautifulSoup

idList = ['1057822-56-LR20']
data = []
# The redirected URL, copied from the browser
url = 'https://www.mercadopublico.cl/Procurement/Modules/RFB/DetailsAcquisition.aspx?qs=a74C100Xv6anaInwzaaUcw=='
res = requests.get(url)
print(res)
soup = BeautifulSoup(res.content, 'lxml')
# Each field falls back to a placeholder when its element is missing
data.append({
    'licitation_number': e.text if (e:=soup.select_one("#lblNumLicitacion")) else 'no licitation number available',
    'responsable': e.text if (e:=soup.select_one("#lblResponsable")) else 'no responsable available',
    'ficha': e.text if (e:=soup.select_one("#lblFicha2Reclamo")) else 'no ficha available',
    'estado_licitacion': e.text if (e:=soup.select_one("#lblFicha1Estado")) else 'no estato licitation available'
})
print(data)
Output:
<Response [200]>
[{'licitation_number': '1057822-56-LR20', 'responsable': 'SERVICIO DE SALUD CONCEPCION, Activos Fijos', 'ficha': '8', 'estado_licitacion': 'Revocada'}]
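To keep the loop from the question without crashing on the first unreachable URL, something like the following might work. It is only a sketch: `parse_details` and `scrape` are hypothetical helper names, the 10-second timeout is an arbitrary choice, `html.parser` is used instead of `lxml` only to avoid an extra dependency, and the list of URLs is assumed to already contain working (redirected) links.

```python
import requests
from bs4 import BeautifulSoup

# Element ids and dict keys are taken from the code above.
FIELDS = {
    "licitation_number": "#lblNumLicitacion",
    "responsable": "#lblResponsable",
    "ficha": "#lblFicha2Reclamo",
    "estado_licitacion": "#lblFicha1Estado",
}


def parse_details(html):
    """Extract the four fields from one page; missing ones become None."""
    soup = BeautifulSoup(html, "html.parser")
    row = {}
    for key, selector in FIELDS.items():
        element = soup.select_one(selector)
        row[key] = element.get_text(strip=True) if element else None
    return row


def scrape(urls, session=None, timeout=10):
    """Fetch each URL; record an error instead of aborting the whole loop."""
    if session is None:
        session = requests.Session()
    rows = []
    for url in urls:
        try:
            response = session.get(url, timeout=timeout)
            response.raise_for_status()
        except requests.exceptions.RequestException as exc:
            # ConnectionError and Timeout are both subclasses of this.
            rows.append({"url": url, "error": str(exc)})
            continue
        # response.url is the final URL after any HTTP (not JavaScript) redirects.
        rows.append({"url": response.url, **parse_details(response.content)})
    return rows
```

Calling `scrape([url])` with the redirected URL from the answer would return a list with one dict per URL; entries that could not be fetched carry an `error` key instead of the parsed fields.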