
Getting an empty list when I try to extract data from a website using BeautifulSoup (web scraping)

I am trying to extract the profile names from the reviews at this link: https://www.amazon.in/Samsung-Midnight-Storage-6000mAh-Battery/dp/B0B4F52B5X/?_encoding=UTF8&pd_rd_w=4JKBg&content-id=amzn1.sym.e0e8ce89-ede3-4c51-b6ad-44989efc8536&pf_rd_p=e0e8ce89-ede3-4c51-b6ad-44989efc8536&pf_rd_r=NEBBF38XJRRBGK0BZBX3&pd_rd_wg=qFxtB&pd_rd_r=0f156162-4690-4ef5-9a8b-8b03e82e194b&ref_=pd_gw_ci_mcx_mr_hp_d&th=1

The names are inside span elements with class_="a-profile-name".

But when I try to print the result, I just get an empty list.

Here is my code:

    
from bs4 import BeautifulSoup as bs
import requests

link = 'https://www.amazon.in/Adidas-Unisex-Sogold-cblack-Football/dp/B096NC52HY/ref=sr_1_3_sspa?crid=1HCHWT6Y1WFYU&keywords=football%2Bshoes&qid=1660709102&sprefix=foot%2Caps%2C246&sr=8-3-spons&th=1&psc=1'

soup = bs(requests.get(link).text, "html.parser")

name = soup.find_all("span", class_="a-profile-name")

print(name)

It is always a good idea to send some headers with your request, such as a user-agent:

requests.get(link, headers={'User-Agent': 'Mozilla/5.0'})

Note: Amazon really does not like being scraped, so sooner or later they will detect your activity and may block you.
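Since a blocked or throttled request may come back as an error status instead of the product page, it can help to check the response before parsing it. The following is only a minimal sketch: the helper name `fetch_html`, the `HEADERS` constant, and the 10-second timeout are assumptions for illustration, and an error status is just one possible way a block could show up.

```python
import requests

# Same browser-like User-Agent as in the answer; the constant name is hypothetical.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch_html(url, timeout=10):
    """Return the page HTML, raising if the server answers with an error status."""
    response = requests.get(url, headers=HEADERS, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx/5xx responses,
    # so a refused request fails loudly instead of yielding an empty list later.
    response.raise_for_status()
    return response.text
```

The returned string can then be passed to bs(html, "html.parser") exactly as in the code in this thread.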

Example

from bs4 import BeautifulSoup as bs
import requests

link = 'https://www.amazon.in/Adidas-Unisex-Sogold-cblack-Football/dp/B096NC52HY/ref=sr_1_3_sspa?crid=1HCHWT6Y1WFYU&keywords=football%2Bshoes&qid=1660709102&sprefix=foot%2Caps%2C246&sr=8-3-spons&th=1&psc=1'

# Send a User-Agent header with the request
response = requests.get(link, headers={'User-Agent': 'Mozilla/5.0'})
soup = bs(response.text, "html.parser")

# Collect every <span class="a-profile-name"> element on the page
name = soup.find_all("span", class_="a-profile-name")
print(name)
Output
[<span class="a-profile-name">Amazon Customer</span>, <span class="a-profile-name">Shubam Kadam</span>, <span class="a-profile-name">Aditi Sharma</span>, <span class="a-profile-name">Moris lopez</span>, <span class="a-profile-name">tana tubin</span>]
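The output above is a list of Tag objects. If only the names themselves are needed, the text can be pulled out of each tag. This is a small sketch on a hard-coded HTML snippet, so it runs without contacting Amazon; the function name `profile_names` and the `sample_html` string are made up for illustration.

```python
from bs4 import BeautifulSoup


def profile_names(html):
    """Return the text of every <span class="a-profile-name"> in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    spans = soup.find_all("span", class_="a-profile-name")
    # get_text(strip=True) drops the tag markup and surrounding whitespace
    return [span.get_text(strip=True) for span in spans]


# Hypothetical snippet shaped like the output shown above
sample_html = (
    '<div>'
    '<span class="a-profile-name">Amazon Customer</span>'
    '<span class="a-profile-name">Shubam Kadam</span>'
    '</div>'
)

print(profile_names(sample_html))  # ['Amazon Customer', 'Shubam Kadam']
```

An empty list from this function means no matching span was present in the HTML that was actually received, which is the symptom described in the question.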

