Requests-html not getting all links
I am trying to scrape a website, but it looks like I can't access all the links. The website is:
https://www.carrefour.es/supermercado/bebidas/refrescos/colas/cat650010/c?ic_source=portal-y-corporativo&ic_medium=menu-links&ic_content=ns
The procedure I am following is to first identify each separate product and then get the link for each one. To my surprise, I can identify all the products on the page, but I can only get the link for the first 8, although the others should have a link too. My code is:
from requests_html import HTMLSession
s = HTMLSession()
url = "https://www.carrefour.es/supermercado/bebidas/refrescos/colas/cat650010/c?ic_source=portal-y-corporativo&ic_medium=menu-links&ic_content=ns"
r = s.get(url)
products = r.html.find('ul.product-card-list__list li')
for item in products:
    print(item.find('a', first=True).attrs["href"])
At some point I get the following error because the link for a product can't be found, although it exists and the product seems to be loaded:
AttributeError: 'NoneType' object has no attribute 'attrs'
Any hints about where the problem is? Many thanks!!
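One way to narrow this down is to stop the loop from crashing and record which list items have no link. This is a diagnostic sketch, not a confirmed fix. The helper name `collect_links` is hypothetical, and it only assumes that each item behaves like the elements returned by `r.html.find(...)` above, where `find('a', first=True)` returns `None` if the item contains no `<a>` element.

```python
def collect_links(items):
    """Split product items into found links and indexes of items without a link.

    `items` is any iterable of objects exposing find(selector, first=True),
    such as the elements returned by r.html.find(...) in the question.
    """
    links, missing = [], []
    for index, item in enumerate(items):
        # find(..., first=True) gives None when the item has no <a> element
        anchor = item.find('a', first=True)
        href = anchor.attrs.get("href") if anchor is not None else None
        if href:
            links.append(href)
        else:
            # Remember the position so the raw HTML of that item can be inspected
            missing.append(index)
    return links, missing
```

With the code from the question, calling `links, missing = collect_links(products)` and then printing `products[i].html` for an index in `missing` would show what the server actually sent for those items. If they turn out to be empty placeholders, one possible explanation (an assumption, not verified against this site) is that the remaining products are filled in by JavaScript, in which case calling `r.html.render()` before `find` may be worth trying.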